DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

arXiv:2601.03559v2

Chain-of-Thought (CoT) reasoning improves multi-step mathematical problem solving in large language models, but it remains vulnerable to exposure bias and error accumulation: early mistakes propagate irreversibly through autoregressive decoding. In this work, we propose DiffCoT, a diffusion-styled CoT framework that reformulates CoT reasoning as an iterative denoising process.
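To make the contrast concrete, the toy sketch below compares a single left-to-right pass, where an injected mistake contaminates every later step, with repeated sweeps that revisit the whole chain. It is not the DiffCoT algorithm: the running-sum task, the function names `autoregressive_chain` and `refine_chain`, and the exact step rule used during refinement are all assumptions chosen for illustration, and no language model or learned denoiser is involved.

```python
def autoregressive_chain(start, increments, faulty_step):
    """Compute running sums left to right.

    A mistake injected at `faulty_step` is never revisited, so it is
    carried into every later step (a stand-in for error accumulation).
    """
    values = []
    current = start
    for i, inc in enumerate(increments):
        current = current + inc
        if i == faulty_step:
            current += 1  # hypothetical injected mistake
        values.append(current)
    return values


def refine_chain(start, increments, values, sweeps):
    """Revise the whole chain for a fixed number of sweeps.

    Each sweep recomputes every step from its predecessor in the
    previous draft, so a correction moves one step further along the
    chain per sweep. The exact step rule is an idealized assumption.
    """
    draft = list(values)
    for _ in range(sweeps):
        previous = [start] + draft[:-1]
        draft = [p + inc for p, inc in zip(previous, increments)]
    return draft


increments = [1, 2, 3, 4]
noisy = autoregressive_chain(0, increments, faulty_step=1)
print(noisy)                                         # [1, 4, 7, 11]
print(refine_chain(0, increments, noisy, sweeps=1))  # [1, 3, 7, 11]
print(refine_chain(0, increments, noisy, sweeps=3))  # [1, 3, 6, 10]
```

In the single pass, the error at the second step shifts every later value. The sweeps instead treat the full chain as a revisable draft, and each one repairs one more step until the chain is consistent.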