arXiv submission date: 2026-01-09
📄 Abstract - The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Large language models (LLMs) often fail to learn effective long chain-of-thought (Long CoT) reasoning by imitating humans or non-Long-CoT LLMs. To understand why, we propose that effective and learnable Long CoT trajectories exhibit stable, molecule-like structures in a unified view, formed by three interaction types: Deep-Reasoning (covalent-like), Self-Reflection (hydrogen-bond-like), and Self-Exploration (van der Waals-like). Analysis of distilled trajectories reveals that these structures emerge from Long CoT fine-tuning, not keyword imitation. We introduce Effective Semantic Isomers and show that only bonds promoting fast entropy convergence support stable Long CoT learning, while structural competition impairs training. Drawing on these findings, we present Mole-Syn, a distribution-transfer-graph method that guides synthesis of effective Long CoT structures, boosting performance and RL stability across benchmarks.
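To make the "fast entropy convergence" notion concrete: one common proxy is the Shannon entropy of the model's next-token distribution, which should fall as training commits to a stable reasoning structure. The sketch below is purely illustrative (the distributions are made up and this is not the paper's measurement procedure), contrasting a distribution that sharpens over training with one that stays flat.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical next-token distributions at successive training steps:
# one sequence sharpens (entropy converges toward 0), the other stays uniform.
sharpening = [
    [0.25, 0.25, 0.25, 0.25],  # early: maximally uncertain
    [0.55, 0.25, 0.15, 0.05],  # mid: starting to commit
    [0.90, 0.05, 0.03, 0.02],  # late: confident
]
flat = [[0.25, 0.25, 0.25, 0.25]] * 3

for step, (a, b) in enumerate(zip(sharpening, flat)):
    print(f"step {step}: sharpening H={shannon_entropy(a):.3f} nats, "
          f"flat H={shannon_entropy(b):.3f} nats")
```

Under this proxy, "fast entropy convergence" would show up as the first column decreasing quickly across steps while the flat baseline stays at log(4) ≈ 1.386 nats.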

Top tags: llm theory model training
Detailed tags: chain-of-thought reasoning fine-tuning knowledge distillation entropy convergence

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning


1️⃣ One-sentence summary

This paper proposes that effective long chain-of-thought reasoning in large language models resembles a stable molecular structure formed by three types of interactions, and builds on this insight to develop a new method for synthesizing such structures, significantly improving reasoning performance and training stability.

Source: arXiv:2601.06002