Self-Compacting Language Model Agents
1️⃣ One-sentence summary
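This paper proposes SelfCompact, a method that lets language model agents decide for themselves when and how to compress their lengthy reasoning traces and tool-call records while carrying out complex tasks. It substantially reduces compute cost while matching or exceeding the task performance of conventional fixed-interval compaction, and it reveals a cognitive blind spot: models struggle to perceive the "rot" of their own context.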
Long agent traces composed of chains of thought and tool calls accumulate stale content that anchors subsequent generations, and they eventually outgrow the context window. Existing scaffolds mitigate this with fixed-interval compaction triggered at a token threshold. Such triggers ignore trajectory structure and risk discarding partial results mid-derivation or mid-search. We propose SelfCompact, a scaffold that allows the model itself to decide when and how to compact. Specifically, it pairs two inference-time elements: (i) a compaction tool the model invokes to summarize the accumulated context, and (ii) a lightweight rubric specifying when to fire (a sub-task has resolved, or the trajectory is converging) and when to suppress (mid-derivation, or when stuck). Both are needed. The tool alone is used unevenly across open-weight models, often invoked at unhelpful moments or not at all; the rubric alone cannot act. Together, they elicit effective adaptive compaction without any fine-tuning or external supervision. We present empirical results on six benchmarks (competitive math and agentic search) and seven models. SelfCompact matches or exceeds fixed-interval summarization at a fraction of the token cost, improving over a no-summarization baseline by up to 18.1 points on math and 5-9 points on agentic search at 30-70% lower per-question cost. The results also expose a meta-cognitive gap: unprompted models cannot reliably tell when their own context is rotting, but a lightweight rubric closes this gap, reframing when to compact as a capability that scaffolds can supply without training.
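The pairing of a compaction tool with a rubric can be illustrated with a minimal sketch. It is not the paper's implementation: the rubric wording, the `Action` type, `run_agent`, and the scripted stand-in for the model (`scripted_policy`) are all hypothetical names introduced here, and the real scaffold would call a language model where the stub is.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical rubric text, paraphrasing the fire/suppress conditions
# listed in the abstract; the paper's actual wording is not shown here.
RUBRIC = (
    "You may call the `compact` tool to summarize the context so far.\n"
    "Fire when: a sub-task has resolved, or the trajectory is converging.\n"
    "Suppress when: mid-derivation, or when stuck."
)


@dataclass
class Action:
    kind: str  # "think", "compact", or "answer"
    text: str  # a thought, a model-written summary, or the final answer


# A policy maps the current context to the next action (stand-in for an LLM).
Policy = Callable[[List[str]], Action]


def run_agent(
    policy: Policy, task: str, max_steps: int = 20
) -> Tuple[Optional[str], List[str], int]:
    """Run the loop; the policy, not a token threshold, decides when to compact."""
    context = [RUBRIC, f"TASK: {task}"]
    compactions = 0
    for _ in range(max_steps):
        action = policy(context)
        if action.kind == "answer":
            return action.text, context, compactions
        if action.kind == "compact":
            # Replace the accumulated trace with the summary,
            # keeping only the rubric and the task statement.
            context = [RUBRIC, f"TASK: {task}", f"SUMMARY: {action.text}"]
            compactions += 1
        else:
            context.append(f"THOUGHT: {action.text}")
    return None, context, compactions


def scripted_policy(context: List[str]) -> Action:
    """Toy stand-in for a model: compacts once sub-task A is resolved."""
    thoughts = [c for c in context if c.startswith("THOUGHT:")]
    has_summary = any(c.startswith("SUMMARY:") for c in context)
    if not has_summary and len(thoughts) < 3:
        return Action("think", f"step {len(thoughts) + 1} of sub-task A")
    if not has_summary:
        return Action("compact", "sub-task A resolved: x = 4")
    if not thoughts:
        return Action("think", "use x = 4 in sub-task B")
    return Action("answer", "42")


answer, final_context, n_compactions = run_agent(scripted_policy, "toy task")
```

In this toy run the stale thoughts from sub-task A are dropped only after that sub-task resolves, which is the behavior a fixed token threshold cannot guarantee.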
Source: arXiv: 2606.23525