On the Generalization Behavior of Deep Residual Networks From a Dynamical System Perspective
1️⃣ One-Sentence Summary
By modeling deep residual networks as dynamical systems, this paper establishes, for the first time, a unified theoretical generalization error bound covering both the discrete- and continuous-time forms of the network. The bound shows how performance improves as the sample size grows, and reveals that the network's structure itself helps reduce the error.
Deep neural networks (DNNs) have significantly advanced machine learning, with model depth playing a central role in their successes. The dynamical system modeling approach has recently emerged as a powerful framework, offering new mathematical insights into the structure and learning behavior of DNNs. In this work, we establish generalization error bounds for both discrete- and continuous-time residual networks (ResNets) by combining Rademacher complexity, flow maps of dynamical systems, and the convergence behavior of ResNets in the deep-layer limit. The resulting bounds are of order $O(1/\sqrt{S})$ with respect to the number of training samples $S$, and include a structure-dependent negative term, yielding depth-uniform and asymptotic generalization bounds under milder assumptions. These findings provide a unified understanding of generalization across both discrete- and continuous-time ResNets, helping to close the gap in both the order of sample complexity and assumptions between the discrete- and continuous-time settings.
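The abstract states that the bounds decay at rate $O(1/\sqrt{S})$ and include a structure-dependent negative term. The paper's exact constants and definitions are not reproduced here; the sketch below only illustrates the generic shape of a Rademacher-complexity generalization bound with such a term, using schematic notation ($\mathfrak{R}_S$, $\Delta_{\mathrm{struct}}$, $C$) that is not taken from the paper:

```latex
% Schematic shape of the bound (illustrative notation, not the paper's):
% expected risk minus empirical risk over S samples, at confidence 1 - \delta.
\[
  \mathbb{E}\bigl[\ell(f_\theta)\bigr]
  - \frac{1}{S}\sum_{i=1}^{S} \ell\bigl(f_\theta; x_i, y_i\bigr)
  \;\le\;
  2\,\mathfrak{R}_S(\mathcal{F})
  + C\sqrt{\frac{\log(1/\delta)}{S}}
  \;-\; \Delta_{\mathrm{struct}},
\]
% Here \mathfrak{R}_S(\mathcal{F}) = O(1/\sqrt{S}) is the Rademacher complexity
% of the hypothesis class, and \Delta_{\mathrm{struct}} \ge 0 stands in for the
% structure-dependent negative term the abstract describes.
```

Both the complexity term and the confidence term shrink as $1/\sqrt{S}$, matching the stated sample-complexity order, while a nonnegative $\Delta_{\mathrm{struct}}$ can only tighten the bound.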
Source: arXiv:2602.20921