arXiv submission date: 2026-02-18
📄 Abstract - From Growing to Looping: A Unified View of Iterative Computation in LLMs

Looping, reusing a block of layers across depth, and depth growing, training shallow-to-deep models by duplicating middle layers, have both been linked to stronger reasoning, but their relationship remains unclear. We provide a mechanistic unification: looped and depth-grown models exhibit convergent depth-wise signatures, including increased reliance on late layers and recurring patterns aligned with the looped or grown block. These shared signatures support the view that their gains stem from a common form of iterative computation. Building on this connection, we show that the two techniques are adaptable and composable: applying inference-time looping to the middle blocks of a depth-grown model improves accuracy on some reasoning primitives by up to $2\times$, despite the model never being trained to loop. Both approaches also adapt better than the baseline when given more in-context examples or additional supervised fine-tuning data. Additionally, depth-grown models achieve the largest reasoning gains when using higher-quality, math-heavy cooldown mixtures, which can be further boosted by adapting a middle block to loop. Overall, our results position depth growth and looping as complementary, practical methods for inducing and scaling iterative computation to improve reasoning.
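
To make the abstract's notion of inference-time looping concrete, here is a minimal sketch, not the authors' implementation. It assumes a hypothetical stack of decoder layers that each map a hidden-state tensor to another tensor; the names `LoopedTransformer`, `loop_start`, `loop_end`, and `num_loops` are illustrative.

```python
# Minimal sketch (assumptions, not the paper's code): reuse a contiguous
# "middle block" of layers several times at inference, sharing weights,
# while the layers before and after the block run once.
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    def __init__(self, layers: nn.ModuleList, loop_start: int, loop_end: int, num_loops: int):
        super().__init__()
        self.layers = layers          # full stack of decoder layers (hypothetical interface)
        self.loop_start = loop_start  # first layer of the looped block (inclusive)
        self.loop_end = loop_end      # last layer of the looped block (exclusive)
        self.num_loops = num_loops    # how many times to reuse the block

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Layers before the looped block run once.
        for layer in self.layers[: self.loop_start]:
            hidden = layer(hidden)
        # The middle block is applied num_loops times with shared weights,
        # which is the "iterative computation" the paper associates with looping.
        for _ in range(self.num_loops):
            for layer in self.layers[self.loop_start : self.loop_end]:
                hidden = layer(hidden)
        # Layers after the looped block run once.
        for layer in self.layers[self.loop_end :]:
            hidden = layer(hidden)
        return hidden
```

Note that in this looping sketch the block's weights are shared across iterations; only the activations change from pass to pass.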

Top-level tags: llm model training theory
Detailed tags: iterative computation model depth reasoning layer looping depth growth

From Growing to Looping: A Unified View of Iterative Computation in LLMs


1️⃣ One-Sentence Summary

This paper finds that two ways of improving reasoning in large language models, 'depth growing' (training shallow-to-deep) and 'looping' (reusing certain layers at inference time), rely on essentially the same underlying mechanism: a form of iterative computation. The two techniques can also be combined to further amplify the gains.
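
For contrast with the looping sketch above, here is a minimal, hypothetical sketch of one depth-growth step: a contiguous block of middle layers is duplicated to deepen the stack before training continues. The helper name `grow_depth`, the block indices, and the simple deep-copy duplication are assumptions, not the paper's recipe.

```python
# Minimal sketch (assumptions, not the paper's recipe) of one depth-growth step:
# duplicate layers[block_start:block_end] so the grown model is deeper, then
# continue training. Unlike looping, the copies become independent parameters.
import copy
import torch.nn as nn

def grow_depth(layers: nn.ModuleList, block_start: int, block_end: int) -> nn.ModuleList:
    """Return a deeper stack in which the middle block appears twice."""
    grown = list(layers[:block_end])                                      # prefix + middle block
    grown += [copy.deepcopy(layer) for layer in layers[block_start:block_end]]  # duplicated block
    grown += list(layers[block_end:])                                     # suffix
    return nn.ModuleList(grown)
```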

Source: arXiv:2602.16490