arXiv submission date: 2026-03-09
📄 Abstract - Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck

Chain-of-Thought (CoT) prompting improves LLM accuracy on complex tasks but often increases token usage and inference cost. Existing "Budget Forcing" methods reduce cost via fine-tuning with heuristic length penalties, which suppress both essential reasoning and redundant filler. We recast efficient reasoning as a lossy compression problem under the Information Bottleneck (IB) principle, and identify a key theoretical gap when applying naive IB to transformers: attention violates the Markov property between prompt, reasoning trace, and response. To resolve this issue, we model CoT generation under the Conditional Information Bottleneck (CIB) principle, where the reasoning trace Z acts as a computational bridge that contains only the information about the response Y that is not directly accessible from the prompt X. This yields a general Reinforcement Learning objective: maximize task reward while compressing completions under a prior over reasoning traces, subsuming common heuristics (e.g., length penalties) as special cases (e.g., uniform priors). In contrast to naive token-counting approaches, we introduce a semantic prior that measures token cost by surprisal under a language model prior. Empirically, our CIB objective prunes cognitive bloat while preserving fluency and logic, improving accuracy at moderate compression and enabling aggressive compression with minimal accuracy drop.
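The reward shaping the abstract describes can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: the function names, the scalar weight `beta`, and the way the prior is supplied (per-token log-probabilities from some prior language model) are all assumptions. It shows the semantic-prior cost (total surprisal of the reasoning trace) and how a uniform prior collapses it to a plain length penalty, as the abstract claims.

```python
import math

def cib_reward(task_reward, trace_token_logprobs, beta=0.01):
    """Illustrative CIB-style RL reward (hypothetical formulation).

    trace_token_logprobs: log-probability of each reasoning-trace token
    under a prior language model. The compression cost is the trace's
    total surprisal, -sum(log p(token)), measured in nats.
    """
    surprisal = -sum(trace_token_logprobs)
    return task_reward - beta * surprisal

def length_penalty_reward(task_reward, num_tokens, vocab_size, beta=0.01):
    """Special case: under a uniform prior p(token) = 1/V, per-token
    surprisal is the constant log(V), so the semantic cost reduces to
    an ordinary token-count (length) penalty."""
    return task_reward - beta * num_tokens * math.log(vocab_size)
```

With a uniform prior over a vocabulary of size V, both functions give the same reward, which is the sense in which length penalties arise as a special case of the CIB objective.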

Top-level tags: llm · theory · model training
Detailed tags: reasoning efficiency · information bottleneck · chain-of-thought · reinforcement learning · lossy compression

Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck


1️⃣ One-Sentence Summary

This paper proposes treating chain-of-thought reasoning in large language models as a compression problem: a new Conditional Information Bottleneck training objective shortens the reasoning process while more intelligently preserving key logical information, maintaining or even improving task accuracy while controlling compute cost.

Source: arXiv:2603.08462