CodeComp: Structural KV Cache Compression for Agentic Coding
1️⃣ One-sentence summary
This paper presents CodeComp, a training-free KV cache compression framework that uses static program analysis to selectively retain structurally critical tokens in code. Under tight memory constraints, it lets large language models drastically shrink the KV cache while keeping accuracy on tasks such as fault localization and code generation close to that of full-context inference.
Agentic code tasks such as fault localization and patch generation require processing long codebases under tight memory constraints, where the Key-Value (KV) cache becomes the primary inference bottleneck. Existing compression methods rely exclusively on attention signals to estimate token importance, systematically discarding structurally critical tokens such as call sites, branch conditions, and assignments that are essential for code understanding. We present CodeComp, a training-free KV cache compression framework that incorporates static program analysis into LLM inference via Code Property Graph priors extracted by Joern. Across bug localization and code generation benchmarks, CodeComp consistently outperforms attention-only compression baselines under equal memory budgets, recovering the majority of full-context accuracy under aggressive KV cache compression. It matches the patch generation quality of uncompressed full-context inference and integrates into SGLang-based agentic coding pipelines without model modification.
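The abstract describes combining attention-based importance with structural priors from a Code Property Graph. The paper's actual scoring rule is not given here, so the following is only a minimal sketch of the general idea: each cached token gets a score mixing its attention weight with a bonus if static analysis marked it structural (call site, branch condition, assignment), and the lowest-scoring tokens are evicted to meet the memory budget. The function name, the linear mixing formula, and the `alpha` weight are all illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch of hybrid KV cache token selection: attention
# signal + structural prior. Not CodeComp's actual algorithm.

def select_kept_tokens(attn_scores, structural_idx, budget, alpha=0.5):
    """Keep `budget` token positions, scoring each as a mix of its
    attention weight and a structural bonus for tokens that static
    analysis (e.g. a Code Property Graph) flagged as critical."""
    scored = []
    for i, a in enumerate(attn_scores):
        prior = 1.0 if i in structural_idx else 0.0  # assumed binary prior
        scored.append((alpha * a + (1 - alpha) * prior, i))
    # Retain the top-`budget` positions; everything else is evicted.
    kept = sorted(i for _, i in sorted(scored, reverse=True)[:budget])
    return kept

# Example: 8 cached tokens; positions 2 and 5 are structurally marked
# but receive little attention. With a pure attention-based policy they
# would be evicted; the structural prior keeps them.
attn = [0.9, 0.1, 0.05, 0.3, 0.2, 0.02, 0.6, 0.4]
print(select_kept_tokens(attn, {2, 5}, budget=4))  # → [0, 2, 5, 6]
```

The point of the sketch: positions 2 and 5 have the lowest attention scores of all, yet survive compression because the structural prior lifts them above ordinary tokens, which is the failure mode of attention-only baselines the paper targets.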
Source: arXiv: 2604.10235