mHC: Manifold-Constrained Hyper-Connections
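1️⃣ One-Sentence Summary
This paper proposes mHC, a new framework that constrains the complex "hyper-connection" structure in neural networks to a specific geometric space (a manifold). It keeps the performance gains of hyper-connections while addressing the training instability and limited scalability they introduce, offering a new direction for designing more powerful and more stable AI models.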
Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifying connectivity patterns. While yielding substantial performance gains, this diversification fundamentally compromises the identity mapping property intrinsic to the residual connection, which causes severe training instability and restricted scalability, and additionally incurs notable memory access overhead. To address these challenges, we propose Manifold-Constrained Hyper-Connections (mHC), a general framework that projects the residual connection space of HC onto a specific manifold to restore the identity mapping property, while incorporating rigorous infrastructure optimization to ensure efficiency. Empirical experiments demonstrate that mHC is effective for training at scale, offering tangible performance improvements and superior scalability. We anticipate that mHC, as a flexible and practical extension of HC, will contribute to a deeper understanding of topological architecture design and suggest promising directions for the evolution of foundational models.
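To make the idea of "projecting the residual connection space onto a manifold" concrete, here is a minimal NumPy sketch. The abstract above does not say which manifold is used, so the sketch assumes, purely for illustration, the set of doubly stochastic matrices (non-negative entries, every row and column summing to 1), reached by Sinkhorn-style alternating normalization. The names `sinkhorn_project` and `mix_streams`, the stream count, and the iteration budget are all hypothetical and not taken from the paper.

```python
import numpy as np

def sinkhorn_project(logits, n_iters=200):
    """Map an unconstrained square matrix to an approximately doubly
    stochastic one (assumed manifold, chosen for illustration only)."""
    m = np.exp(logits - logits.max())  # strictly positive entries
    for _ in range(n_iters):
        m = m / m.sum(axis=1, keepdims=True)  # normalize rows to sum to 1
        m = m / m.sum(axis=0, keepdims=True)  # normalize columns to sum to 1
    return m

def mix_streams(streams, logits):
    """Mix the widened residual streams with the constrained matrix.
    streams: (n_streams, d_model); logits: (n_streams, n_streams)."""
    return sinkhorn_project(logits) @ streams

rng = np.random.default_rng(0)
n_streams, d_model = 4, 8  # hypothetical sizes
logits = 0.5 * rng.standard_normal((n_streams, n_streams))
streams = rng.standard_normal((n_streams, d_model))

h = sinkhorn_project(logits)
mixed = mix_streams(streams, logits)

# Columns summing to 1 means the average over streams passes through
# unchanged, an identity-like behaviour that unconstrained mixing lacks.
print(np.abs(mixed.mean(axis=0) - streams.mean(axis=0)).max())
```

Under this assumption the mixing matrix can still redistribute information between streams, but it cannot amplify or shrink their average, which is one way a constraint of this kind could recover the stabilizing effect of a plain residual connection.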
Source: arXiv: 2512.24880