Ares: Adaptive Reasoning Effort Selection for Efficient LLM Agents
1️⃣ One-Sentence Summary
This paper proposes a framework called Ares, which uses a lightweight router to dynamically select the lowest necessary reasoning effort for each step of an agent task, significantly reducing the inference cost of LLM agents with almost no impact on task success rates.
Modern agents powered by thinking LLMs achieve high accuracy through long chain-of-thought reasoning but incur substantial inference costs. While many LLMs now support configurable reasoning levels (e.g., high/medium/low), static strategies are often ineffective: using low-effort modes at every step leads to significant performance degradation, while random selection fails to preserve accuracy or provide meaningful cost reduction. Intuitively, agents should reserve high reasoning effort for difficult steps like navigating complex website structures, while using lower-effort modes for simpler steps like opening a target URL. In this paper, we propose Ares, a framework for per-step dynamic reasoning effort selection tailored for multi-step agent tasks. Ares employs a lightweight router to predict the lowest appropriate reasoning level for each step based on the interaction history. To train this router, we develop a data generation pipeline that identifies the minimum reasoning effort required for successful step completion. We then fine-tune the router to predict these levels, enabling plug-and-play integration with any LLM agent. We evaluate Ares on a diverse set of agent tasks, including TAU-Bench for tool-use agents, BrowseComp-Plus for deep-research agents, and WebArena for web agents. Experimental results show that Ares reduces reasoning token usage by up to 52.7% compared to fixed high-effort reasoning, while introducing minimal degradation in task success rates.
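The abstract's routing loop can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: `Router`, `call_llm`, and the keyword heuristic are all hypothetical stand-ins; the actual router is a fine-tuned lightweight model conditioned on the full interaction history.

```python
# Hypothetical sketch of Ares-style per-step reasoning effort selection.
# All names here are illustrative assumptions, not the paper's API.

EFFORTS = ["low", "medium", "high"]  # configurable reasoning levels


class Router:
    """Lightweight predictor: (history, step) -> lowest adequate effort."""

    def predict(self, history, step):
        # Toy stand-in heuristic; the paper instead fine-tunes a small
        # model on data labeling the minimum effort that still succeeds.
        return "high" if "navigate" in step else "low"


def call_llm(prompt, effort):
    # Placeholder for calling an LLM with a configurable reasoning level.
    return f"[{effort}] response to: {prompt}"


def run_agent(steps):
    """Run each step at the cheapest effort the router deems sufficient."""
    router, history, outputs = Router(), [], []
    for step in steps:
        effort = router.predict(history, step)  # pick lowest adequate level
        outputs.append((effort, call_llm(step, effort)))
        history.append(step)                    # router sees prior context
    return outputs


results = run_agent(["open target URL", "navigate complex site structure"])
```

The design point is that effort is chosen per step rather than per task: the simple URL-opening step runs at low effort while the navigation step runs at high effort, which is how token savings accrue without sacrificing the hard steps.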
Source: arXiv:2603.07915