Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
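1️⃣ One-sentence summary
This paper proposes ThinkARM, a framework that "dissects" an AI's thinking process by breaking the reasoning steps large language models take on math problems into functional modules such as analysis, exploration, implementation, and verification, revealing fundamental differences in how models think and how key steps affect answer correctness.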
Large language models increasingly expose reasoning traces, yet their underlying cognitive structure and steps remain difficult to identify and analyze beyond surface-level statistics. We adopt Schoenfeld's Episode Theory as an inductive, intermediate-scale lens and introduce ThinkARM (Anatomy of Reasoning in Models), a scalable framework that explicitly abstracts reasoning traces into functional reasoning steps such as Analysis, Explore, Implement, Verify, etc. When applied to mathematical problem solving by diverse models, this abstraction reveals reproducible thinking dynamics and structural differences between reasoning and non-reasoning models, which are not apparent from token-level views. We further present two diagnostic case studies showing that exploration functions as a critical branching step associated with correctness, and that efficiency-oriented methods selectively suppress evaluative feedback steps rather than uniformly shortening responses. Together, our results demonstrate that episode-level representations make reasoning steps explicit, enabling systematic analysis of how reasoning is structured, stabilized, and altered in modern language models.
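To make the idea of an episode-level representation concrete, here is a minimal sketch in Python. It is not the ThinkARM implementation: the example trace, its sentences, and the helper names `episode_share` and `transitions` are all hypothetical, and only the four labels (Analysis, Explore, Implement, Verify) come from the abstract. It assumes a trace has already been segmented and labeled, then computes two simple views that a token-level count would not give: the share of words spent in each episode type and the transitions between episode types.

```python
from collections import Counter

# Hypothetical, hand-labeled trace: (episode label, sentence).
# The labels are the four named in the abstract; the sentences are invented.
trace = [
    ("Analysis", "We need the number of pairs x, y >= 0 with x + y = 5."),
    ("Explore", "Maybe list the small cases first."),
    ("Implement", "The pairs are (0,5), (1,4), (2,3), (3,2), (4,1), (5,0)."),
    ("Verify", "Each pair sums to 5, so the count is 6."),
]

def episode_share(trace):
    """Fraction of whitespace-separated words spent in each episode label."""
    words = Counter()
    for label, text in trace:
        words[label] += len(text.split())
    total = sum(words.values())
    return {label: n / total for label, n in words.items()}

def transitions(trace):
    """Count label-to-label transitions between consecutive segments
    whose labels differ."""
    labels = [label for label, _ in trace]
    return Counter((a, b) for a, b in zip(labels, labels[1:]) if a != b)

if __name__ == "__main__":
    print(episode_share(trace))
    print(transitions(trace))
```

Under these assumptions, comparing the output of `episode_share` across models, or across correct and incorrect traces, is one simple way to look for the kinds of structural differences the abstract describes.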
Source: arXiv: 2512.19995