Recursive Models for Long-Horizon Reasoning
1️⃣ One-Sentence Summary
This paper proposes a method in which an AI model recursively invokes itself to solve complex long-horizon problems. It proves theoretically that this approach overcomes the context-length limits of existing models, and demonstrates experimentally that a small model trained to reason this way outperforms stronger frontier large language models on a hard logical-reasoning task.
Modern language models reason within bounded context, an inherent constraint that poses a fundamental barrier to long-horizon reasoning. We identify recursion as a core principle for overcoming this barrier, and propose recursive models as a minimal realization, where the model can recursively invoke itself to solve subtasks in isolated contexts. We prove that any computable problem admits a recursive decomposition in which each subtask requires only exponentially smaller active context than standard autoregressive models; this strictly surpasses any context management approach confined to a single sequence, such as summarization. We further generalize our framework to modern agentic systems with arbitrary context processing and control flows, and prove that recursive models can achieve optimal power within this broader class. Experimentally, we train a 3B model to reason recursively and evaluate on Boolean satisfiability, a task requiring long-horizon combinatorial search, where it significantly outperforms frontier LLMs.
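The paper's central idea — decomposing a long-horizon problem into subtasks, each solved in a small isolated context — has a classic analogue in recursive combinatorial search. The sketch below is not the paper's trained model or its actual decomposition; it is an illustrative backtracking SAT solver (DPLL-style branching without unit propagation) showing how each recursive call operates only on a simplified sub-formula, so the "active context" per call stays small even though the overall search is exponential.

```python
def simplify(clauses, lit):
    """Assume literal `lit` is true: drop satisfied clauses and
    remove the negated literal from the rest. Returns None on conflict."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue  # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None  # empty clause: this assignment fails
        out.append(reduced)
    return out


def sat(clauses):
    """Recursive satisfiability search over CNF clauses, where each
    clause is a list of nonzero ints (negative = negated variable).

    Each recursive call sees only the simplified sub-formula -- an
    isolated, strictly smaller "context", loosely analogous to the
    recursive subtasks described in the abstract."""
    if not clauses:
        return True  # all clauses satisfied
    var = abs(clauses[0][0])
    for lit in (var, -var):  # branch on both truth values
        reduced = simplify(clauses, lit)
        if reduced is not None and sat(reduced):
            return True
    return False
```

For example, `sat([[1, -2], [2]])` finds a satisfying assignment, while `sat([[1, 2], [-1, 2], [-2]])` exhausts both branches and reports unsatisfiability. The analogy is structural only: in the paper, the recursing unit is a language model spawning sub-invocations of itself, not a hand-written solver.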
Source: arXiv: 2603.02112