Learning State-Tracking from Code Using Linear RNNs
1️⃣ One-Sentence Summary
By converting state-tracking tasks into code execution traces, this paper finds that linear recurrent neural networks (RNNs) can learn such tasks effectively while Transformers perform poorly, and further reveals the limitations of linear RNNs when action information is not fully observable.
In recent years, state-tracking tasks, particularly permutation composition, have become a testbed for understanding the limits of sequence model architectures such as Transformers and RNNs (linear and non-linear). However, these are typically sequence-to-sequence tasks: learning to map actions (permutations) to states, which is incompatible with the next-token prediction setting commonly used to train language models. We address this gap by converting permutation composition into code via REPL traces that interleave state reveals (prints) with variable transformations. We show that linear RNNs capable of state tracking also excel in this setting, while Transformers still fail. Motivated by this representation, we investigate why tracking state in code is difficult in general: actions are not always fully observable. We frame this as tracking the state of a probabilistic finite-state automaton with deterministic state reveals, and show that linear RNNs can be worse than non-linear RNNs at tracking states in this setup.
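To make the "permutation composition as code" idea concrete, here is a minimal sketch of how such a task could be rendered as an executable REPL-style trace. This is our own illustration, not the paper's released code: the function name, swap-based action format, and trace layout are all assumptions.

```python
import random

def permutation_trace(n_items=3, n_steps=4, seed=0):
    """Render a permutation-composition task as an executable trace that
    interleaves actions (variable transformations, here element swaps)
    with state reveals (print calls). Illustrative sketch only."""
    rng = random.Random(seed)
    state = list(range(n_items))
    lines = [f"x = {state}"]
    for _ in range(n_steps):
        # Each action is a random transposition of two positions.
        i, j = rng.sample(range(n_items), 2)
        state[i], state[j] = state[j], state[i]
        lines.append(f"x[{i}], x[{j}] = x[{j}], x[{i}]")
        # State reveal: a print whose output the model must predict.
        lines.append(f"print(x)  # -> {state}")
    return "\n".join(lines)

print(permutation_trace())
```

In an actual training trace the printed state would appear as output tokens the model must predict; here it is shown as a trailing comment for compactness. Hiding some of the swap lines would give the partially observable variant the abstract describes.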
From arXiv: 2602.14814