arXiv submission date: 2026-01-01
📄 Abstract - Deep Delta Learning

The efficacy of deep residual networks is fundamentally predicated on the identity shortcut connection. While this mechanism effectively mitigates the vanishing gradient problem, it imposes a strictly additive inductive bias on feature transformations, thereby limiting the network's capacity to model complex state transitions. In this paper, we introduce Deep Delta Learning (DDL), a novel architecture that generalizes the standard residual connection by modulating the identity shortcut with a learnable, data-dependent geometric transformation. This transformation, termed the Delta Operator, constitutes a rank-1 perturbation of the identity matrix, parameterized by a reflection direction vector $\mathbf{k}(\mathbf{X})$ and a gating scalar $\beta(\mathbf{X})$. We provide a spectral analysis of this operator, demonstrating that the gate $\beta(\mathbf{X})$ enables dynamic interpolation between identity mapping, orthogonal projection, and geometric reflection. Furthermore, we restructure the residual update as a synchronous rank-1 injection, where the gate acts as a dynamic step size governing both the erasure of old information and the writing of new features. This unification empowers the network to explicitly control the spectrum of its layer-wise transition operator, enabling the modeling of complex, non-monotonic dynamics while preserving the stable training characteristics of gated residual architectures.
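The abstract pins down enough structure for a sketch: with a unit direction $\mathbf{k}(\mathbf{x})$ and gate $\beta(\mathbf{x})$, the operator $A = I - \beta\,\mathbf{k}\mathbf{k}^\top$ has eigenvalue $1-\beta$ along $\mathbf{k}$ and $1$ on the orthogonal complement, so $\beta = 0, 1, 2$ yields identity, orthogonal projection, and reflection respectively. Below is a minimal PyTorch reading of this, treating $\mathbf{x}$ as a per-token feature vector; the projection names (`k_proj`, `v_proj`, `beta_proj`), the sigmoid gate rescaled to $(0, 2)$, and the value head `v` are assumptions for illustration, not the paper's exact parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeltaLayer(nn.Module):
    """Sketch of a DDL-style layer: x' = (I - beta * k k^T) x + beta * v.

    The gate beta acts as a shared dynamic step size: it erases the old
    component of x along k and writes the new features v in one update.
    (Layer names and gating range are assumptions, not the paper's.)
    """
    def __init__(self, d):
        super().__init__()
        self.k_proj = nn.Linear(d, d)     # reflection direction k(x)
        self.v_proj = nn.Linear(d, d)     # new features v(x) to write
        self.beta_proj = nn.Linear(d, 1)  # scalar gate beta(x)

    def forward(self, x):                               # x: (..., d)
        k = F.normalize(self.k_proj(x), dim=-1)         # unit-norm direction
        beta = 2.0 * torch.sigmoid(self.beta_proj(x))   # gate in (0, 2)
        v = self.v_proj(x)
        # Synchronous rank-1 erase + write, both scaled by beta:
        # x - beta * (k . x) k + beta * v  ==  (I - beta k k^T) x + beta v
        return x - beta * (x * k).sum(-1, keepdim=True) * k + beta * v

# Spectral sanity check: with ||k|| = 1, A = I - beta k k^T maps k to
# (1 - beta) k, sweeping identity (beta=0), projection (1), reflection (2).
d = 8
k = F.normalize(torch.randn(d), dim=0)
for beta in (0.0, 1.0, 2.0):
    A = torch.eye(d) - beta * torch.outer(k, k)
    assert torch.allclose(A @ k, (1 - beta) * k, atol=1e-6)
```

Note the degenerate case: at $\beta = 0$ the layer reduces to a plain identity shortcut plus nothing, recovering the standard residual behavior the paper generalizes.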

Top-level tags: model training, theory, machine learning
Detailed tags: neural architecture, residual networks, spectral analysis, geometric transformation, rank-1 perturbation

Deep Delta Learning


1️⃣ One-Sentence Summary

This paper proposes a new network architecture called Deep Delta Learning, which dynamically modulates the residual connection through a learnable Delta Operator: the network retains the stable training of gated residual architectures while modeling complex feature transformations far more flexibly, moving beyond the strictly additive updates of conventional residual networks.

Source: arXiv:2601.00417