arXiv submission date: 2026-01-07
📄 Abstract - DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Chain-of-Thought (CoT) reasoning improves multi-step mathematical problem solving in large language models but remains vulnerable to exposure bias and error accumulation, as early mistakes propagate irreversibly through autoregressive decoding. In this work, we propose DiffCoT, a diffusion-styled CoT framework that reformulates CoT reasoning as an iterative denoising process. DiffCoT integrates diffusion principles at the reasoning-step level via a sliding-window mechanism, enabling unified generation and retrospective correction of intermediate steps while preserving token-level autoregression. To maintain causal consistency, we further introduce a causal diffusion noise schedule that respects the temporal structure of reasoning chains. Extensive experiments on three multi-step CoT reasoning benchmarks across diverse model backbones demonstrate that DiffCoT consistently outperforms existing CoT preference optimization methods, yielding improved robustness and error-correction capability in CoT reasoning.
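The abstract gives no implementation details, but the core loop it describes — drafting reasoning steps autoregressively, then iteratively revising ("denoising") the steps inside a trailing window before the chain advances — can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' method: `generate_step`, `revise_step`, the window size, and the number of refinement rounds are all hypothetical placeholders.

```python
from typing import Callable, List

def diffcot_sketch(
    problem: str,
    generate_step: Callable[[str, List[str]], str],    # LM call: next step given chain so far
    revise_step: Callable[[str, List[str], int], str], # LM call: re-denoise step i in context
    num_steps: int = 6,
    window: int = 3,         # assumed sliding-window size
    refine_rounds: int = 2,  # assumed number of denoising passes per window
) -> List[str]:
    """Sketch of diffusion-styled CoT: steps are drafted autoregressively,
    then steps inside a sliding window are iteratively revised before the
    chain moves on. An interpretation of the abstract, not the paper's code."""
    chain: List[str] = []
    for _ in range(num_steps):
        # Token-level autoregression is preserved: each step is still
        # generated left-to-right, conditioned on the current chain.
        chain.append(generate_step(problem, chain))
        # Retrospective correction: re-denoise the steps in the window.
        lo = max(0, len(chain) - window)
        for _ in range(refine_rounds):
            for i in range(lo, len(chain)):
                chain[i] = revise_step(problem, chain, i)
    return chain
```

In this reading, the window bounds how far back a correction can reach, which keeps the cost of the denoising passes linear in chain length while still allowing early mistakes inside the window to be repaired.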

Top tags: llm, natural language processing, model training
Detailed tags: chain-of-thought reasoning, diffusion models, error correction, mathematical problem solving

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs


1️⃣ One-sentence summary

This paper proposes DiffCoT, a method that treats chain-of-thought reasoning as analogous to image denoising: intermediate reasoning steps are iteratively corrected, which markedly improves the accuracy and robustness of large language models on complex mathematical problems.
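The "causal diffusion noise schedule" mentioned in the abstract is not specified here. One natural reading is that the noise level (and hence how much revision a step is allowed) grows with step recency, so earlier, more settled steps are treated as nearly clean while later steps remain revisable, respecting the temporal order of the chain. A hedged sketch of such a monotone schedule, as an assumption rather than the paper's formula:

```python
def causal_noise_schedule(num_steps: int, sigma_max: float = 1.0) -> list[float]:
    """One plausible causal schedule (an assumption, not the paper's formula):
    noise increases monotonically with step position, so earlier steps are
    near-clean and later steps are still noisy/revisable."""
    if num_steps == 1:
        return [sigma_max]
    return [sigma_max * i / (num_steps - 1) for i in range(num_steps)]

# Example: 4 reasoning steps -> sigmas of roughly [0.0, 0.33, 0.67, 1.0]
print(causal_noise_schedule(4))
```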

Source: arXiv:2601.03559