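Inducing Reasoning Primitives from Agent Traces
1️⃣ One-sentence summary
This paper proposes a method that analyzes the traces AI agents (for example, ReAct-style agents) leave behind while solving problems, automatically extracts high-frequency, reusable reasoning steps, and converts them into a compact library of "pseudo-tools". The agent can then invoke these reasoning routines more reliably on later tasks, which markedly improves accuracy on complex tasks such as logical reasoning, rule application, and planning, matching or even surpassing expert-designed solutions.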
ReAct-style LLM agents often rediscover the same reasoning routines across problems, yet leave those routines trapped in transient scratchpads. We introduce Reasoning Primitive Induction, a single-pass method that mines successful ReAct traces, clusters recurrent reasoning moves, and converts the most frequent moves into a compact library of typed pseudo-tools. Each pseudo-tool is specified by a natural-language docstring interpreted by an LLM at invocation time, and a standard ReAct loop composes these primitives at test time. The central result is that induced libraries outperform the very agent that generated their traces: by +44pp on RuleArena NBA (30 -> 74), +30pp on MuSR team allocation (38 -> 68), and +22pp on NatPlan meeting planning (7 -> 29). Across five comparable subtasks spanning narrative deduction, rule application, and constraint-satisfaction planning, a single fixed configuration improves over zero-shot Chain-of-Thought on every subtask, matches or surpasses expert-authored decompositions, and outperforms AWM at lower average inference cost.
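To make the mine, cluster, and convert pipeline in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation. The `Trace` and `PseudoTool` types, the `induce_library` function, and the toy traces are all hypothetical, and real clustering of reasoning moves is replaced by simple string normalization. The LLM that would interpret each docstring at invocation time is omitted entirely.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """One agent run (hypothetical format): labeled reasoning moves plus outcome."""
    steps: tuple[str, ...]
    success: bool


@dataclass(frozen=True)
class PseudoTool:
    """A primitive described only by a docstring; an LLM would interpret it."""
    name: str
    docstring: str
    support: int  # number of successful traces containing the move


def normalize(step: str) -> str:
    # Assumption: stands in for clustering by lowercasing and collapsing spaces.
    return " ".join(step.lower().split())


def induce_library(traces: list[Trace], top_k: int = 2) -> list[PseudoTool]:
    counts: Counter[str] = Counter()
    for trace in traces:
        if not trace.success:
            continue  # only successful traces are mined
        # Count each move at most once per trace.
        for move in {normalize(s) for s in trace.steps}:
            counts[move] += 1
    # Most frequent first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        PseudoTool(
            name=move.replace(" ", "_"),
            docstring=f"Perform the reasoning move: {move}.",
            support=n,
        )
        for move, n in ranked[:top_k]
    ]


# Toy traces, invented for illustration only.
TRACES = [
    Trace(("List constraints", "check each constraint"), True),
    Trace(("list  constraints", "Eliminate candidates"), True),
    Trace(("guess",), False),
    Trace(("List constraints", "eliminate candidates"), True),
]

if __name__ == "__main__":
    for tool in induce_library(TRACES):
        print(tool.name, tool.support, tool.docstring)
```

In this sketch the failed trace contributes nothing, and the two moves with the highest support become the library. A test-time ReAct loop would then expose each `PseudoTool` as a callable action whose behavior is defined by its docstring.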
Source: arXiv:2606.02994