Beyond Quantity: Trajectory Diversity Scaling for Code Agents
1️⃣ One-Sentence Summary
This paper proposes a new framework called TDScaling, which trains tool-using code-generation LLMs more effectively by increasing the diversity of the training data rather than simply adding more of it, enabling the models to better master complex tool use while preserving their coding ability.
As code large language models (LLMs) evolve into tool-interactive agents via the Model Context Protocol (MCP), their generalization is increasingly limited by low-quality synthetic data and the diminishing returns of quantity scaling. Moreover, quantity-centric scaling exhibits an early bottleneck that underutilizes trajectory data. We propose TDScaling, a Trajectory Diversity Scaling-based data synthesis framework for code agents that scales performance through diversity rather than raw volume. Under a fixed training budget, increasing trajectory diversity yields larger gains than adding more trajectories, improving the performance-cost trade-off for agent training. TDScaling integrates four innovations: (1) a Business Cluster mechanism that captures real-service logical dependencies; (2) a blueprint-driven multi-agent paradigm that enforces trajectory coherence; (3) an adaptive evolution mechanism that steers synthesis toward long-tail scenarios using Domain Entropy, Reasoning Mode Entropy, and Cumulative Action Complexity to prevent mode collapse; and (4) a sandboxed code tool that mitigates catastrophic forgetting of intrinsic coding capabilities. Experiments on general tool-use benchmarks (BFCL, tau^2-Bench) and code agent tasks (RebenchT, CodeCI, BIRD) demonstrate a win-win outcome: TDScaling improves both tool-use generalization and inherent coding proficiency. We plan to release the full codebase and the synthesized dataset (including 30,000+ tool clusters) upon publication.
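The abstract's diversity signals (Domain Entropy and Reasoning Mode Entropy) are not given explicit formulas here, but a natural reading is Shannon entropy over the labels of the synthesized trajectory pool: a pool collapsed onto one domain scores zero, while broad coverage scores high. A minimal sketch under that assumption (the function name and toy labels are hypothetical, not from the paper):

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy (bits) of a label distribution.

    Hypothetical stand-in for the paper's Domain Entropy /
    Reasoning Mode Entropy diversity signals: higher entropy
    means the trajectory pool covers more modes.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    ent = 0.0
    for c in counts.values():
        p = c / total
        ent -= p * math.log2(p)
    return ent

# Toy trajectory pools, each trajectory tagged with a domain label.
collapsed = ["web", "web", "web", "web"]   # mode collapse: one domain
diverse = ["web", "db", "fs", "api"]       # broad domain coverage

print(shannon_entropy(collapsed))  # 0.0
print(shannon_entropy(diverse))    # 2.0
```

An adaptive evolution loop like the one described could then prefer synthesizing trajectories whose labels raise this score, steering generation toward under-covered long-tail scenarios.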
Source: arXiv 2602.03219