arXiv submission date: 2026-02-03
📄 Abstract - Sequential Group Composition: A Window into the Mechanics of Deep Learning

How do neural networks trained over sequences acquire the ability to perform structured operations, such as arithmetic, geometric, and algorithmic computation? To gain insight into this question, we introduce the sequential group composition task. In this task, networks receive a sequence of elements from a finite group encoded in a real vector space and must predict their cumulative product. The task can be order-sensitive and requires a nonlinear architecture to be learned. Our analysis isolates the roles of the group structure, encoding statistics, and sequence length in shaping learning. We prove that two-layer networks learn this task one irreducible representation of the group at a time in an order determined by the Fourier statistics of the encoding. These networks can perfectly learn the task, but doing so requires a hidden width exponential in the sequence length $k$. In contrast, we show how deeper models exploit the associativity of the task to dramatically improve this scaling: recurrent neural networks compose elements sequentially in $k$ steps, while multilayer networks compose adjacent pairs in parallel in $\log k$ layers. Overall, the sequential group composition task offers a tractable window into the mechanics of deep learning.
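To make the task concrete, here is a minimal sketch of how training data for it might be generated, using the symmetric group S_3 (non-abelian, so the cumulative product is order-sensitive) with a one-hot encoding. The group choice, encoding, and shapes are illustrative assumptions; the paper considers general finite groups and real-vector encodings.

```python
# Sketch of the sequential group composition task for S_3 with a one-hot
# encoding. These specifics are illustrative assumptions, not the paper's
# exact experimental setup.
from functools import reduce
from itertools import permutations

import numpy as np

ELEMS = list(permutations(range(3)))          # the 6 elements of S_3
INDEX = {p: i for i, p in enumerate(ELEMS)}

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); S_3 is non-abelian, so order matters."""
    return tuple(p[q[i]] for i in range(3))

def make_batch(k=4, batch=32, seed=0):
    """Sequences of k elements of S_3 and their cumulative product.

    Inputs are one-hot vectors in R^6 (one admissible real encoding);
    the target is the one-hot code of g_1 ∘ g_2 ∘ ... ∘ g_k.
    """
    rng = np.random.default_rng(seed)
    g = rng.integers(0, len(ELEMS), size=(batch, k))
    y = [INDEX[reduce(compose, [ELEMS[i] for i in row])] for row in g]
    eye = np.eye(len(ELEMS))
    return eye[g].reshape(batch, -1), eye[y]   # (batch, 6k), (batch, 6)

X, Y = make_batch()
print(X.shape, Y.shape)   # (32, 24) (32, 6)
```

Flattening the sequence into a single (batch, 6k) input matches the two-layer setting in the abstract; a recurrent or multilayer model would instead consume the (batch, k, 6) sequence step by step or layer by layer.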

Top-level tags: theory, model training, machine learning
Detailed tags: group theory, sequence learning, network depth, representation learning, theoretical analysis

Sequential Group Composition: A Window into the Mechanics of Deep Learning


1️⃣ One-sentence summary

By designing a mathematical task called "sequential group composition", this paper reveals how neural networks of different depths (shallow networks, recurrent networks, and multilayer networks) exploit the group's structure and the associativity of its operation to learn sequential data efficiently, providing an analytically tractable theoretical model for understanding the inner workings of deep learning.
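The associativity argument can be made concrete with a balanced reduction: a depth-log(k) model can halve the sequence at every layer by composing adjacent pairs, while a recurrent model performs k sequential steps (a left fold). The sketch below is plain Python over Z_5, not the paper's architecture; it only illustrates the counting argument.

```python
# Illustrative only: tree reduction mirrors a log(k)-depth network,
# functools.reduce mirrors a k-step recurrent computation. Z_5 is an
# assumed example group; any associative operation works.
from functools import reduce

def compose_mod(a, b, n=5):
    """Group operation for the cyclic group Z_n."""
    return (a + b) % n

def tree_reduce(seq, op):
    """Compose adjacent pairs in parallel 'layers': ceil(log2 k) rounds."""
    layer = list(seq)
    while len(layer) > 1:
        nxt = [op(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:          # carry the unpaired element forward
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]

seq = [3, 1, 4, 1, 2, 0, 4, 2]
# Same answer either way, by associativity: 3 parallel layers vs 7 steps.
assert tree_reduce(seq, compose_mod) == reduce(compose_mod, seq)
```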

Source: arXiv 2602.03655