📄 Abstract - ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Large language models are powerful generalists, yet solving deep and complex problems such as those of the Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensive. We show that small orchestrators managing other models and a variety of tools can both push the upper bound of intelligence and improve efficiency in solving difficult agentic tasks. We introduce ToolOrchestra, a method for training small orchestrators that coordinate intelligent tools. ToolOrchestra explicitly uses reinforcement learning with outcome-, efficiency-, and user-preference-aware rewards. Using ToolOrchestra, we produce Orchestrator, an 8B model that achieves higher accuracy at lower cost than previous tool-use agents while aligning with user preferences on which tools are to be used for a given query. On HLE, Orchestrator achieves a score of 37.1%, outperforming GPT-5 (35.1%) while being 2.5x more efficient. On tau2-Bench and FRAMES, Orchestrator surpasses GPT-5 by a wide margin while using only about 30% of the cost. Extensive analysis shows that Orchestrator achieves the best trade-off between performance and cost under multiple metrics, and generalizes robustly to unseen tools. These results demonstrate that composing diverse tools with a lightweight orchestration model is both more efficient and more effective than existing methods, paving the way for practical and scalable tool-augmented reasoning systems.
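The abstract states that ToolOrchestra trains the orchestrator with reinforcement learning using outcome-, efficiency-, and user-preference-aware rewards, but does not spell out the formula. Below is a minimal, illustrative Python sketch of what such a composite reward could look like; the `EpisodeStats` fields, the `composite_reward` weights, and the budget-normalized efficiency term are all assumptions, not the paper's actual formulation.

```python
# Minimal sketch (NOT the paper's exact formulation) of a reward that combines
# task outcome, inference cost, and user tool preferences, as described in the
# ToolOrchestra abstract. All weights and terms below are illustrative.
from dataclasses import dataclass


@dataclass
class EpisodeStats:
    solved: bool             # did the orchestrated rollout answer the query correctly?
    total_cost: float        # accumulated cost of all model and tool calls
    cost_budget: float       # reference budget used to normalize the efficiency term
    preference_score: float  # in [0, 1]: how well tool choices matched user preferences


def composite_reward(stats: EpisodeStats,
                     w_outcome: float = 1.0,
                     w_efficiency: float = 0.3,
                     w_preference: float = 0.2) -> float:
    """Outcome-, efficiency-, and user-preference-aware reward (illustrative)."""
    outcome = 1.0 if stats.solved else 0.0
    # Efficiency: reward staying under budget, penalize overshoot (clipped to [-1, 1]).
    efficiency = max(-1.0, min(1.0, 1.0 - stats.total_cost / stats.cost_budget))
    return w_outcome * outcome + w_efficiency * efficiency + w_preference * stats.preference_score


if __name__ == "__main__":
    stats = EpisodeStats(solved=True, total_cost=0.8, cost_budget=2.0, preference_score=0.9)
    print(f"reward = {composite_reward(stats):.3f}")  # 1.000 + 0.3*0.6 + 0.2*0.9 = 1.360
```

In this sketch the outcome term dominates, while the efficiency and preference terms act as smaller tie-breakers, which matches the abstract's claim that the orchestrator pushes accuracy while lowering cost and respecting user preferences over tool choice.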

Top-level tags: llm agents model training
Detailed tags: tool orchestration reinforcement learning efficient inference tool-augmented reasoning model coordination

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration


1️⃣ One-sentence summary

This paper proposes a method called ToolOrchestra, which trains a small "conductor" model to coordinate calls to a variety of intelligent tools. On complex tasks, this achieves higher performance and efficiency than large language models (such as GPT-5) at lower cost, while better satisfying user preferences about which tools are used.


📄 Open the original PDF