arXiv submission date: 2026-04-20
📄 Abstract - Random Matrix Theory of Early-Stopped Gradient Flow: A Transient BBP Scenario

Empirical studies of trained models often report a transient regime in which signal is detectable in a finite gradient descent time window before overfitting dominates. We provide an analytically tractable random-matrix model that reproduces this phenomenon for gradient flow in a linear teacher--student setting. In this framework, learning occurs when an isolated eigenvalue separates from a noisy bulk, before eventually disappearing in the overfitting regime. The key ingredient is anisotropy in the input covariance, which induces fast and slow directions in the learning dynamics. In a two-block covariance model, we derive the full time-dependent bulk spectrum of the symmetrized weight matrix through a $2\times 2$ Dyson equation, and we obtain an explicit outlier condition for a rank-one teacher via a rank-two determinant formula. This yields a transient Baik-Ben Arous-Péché (BBP) transition: depending on signal strength and covariance anisotropy, the teacher spike may never emerge, emerge and persist, or emerge only during an intermediate time interval before being reabsorbed into the bulk. We map the corresponding phase diagrams and validate the theory against finite-size simulations. Our results provide a minimal solvable mechanism for early stopping as a transient spectral effect driven by anisotropy and noise.
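The setting described in the abstract can be explored numerically with a minimal sketch. Everything specific in it is an assumption made for illustration, not the paper's construction: the loss $\frac{1}{2n}\lVert Y - XW^\top\rVert_F^2$ with $W(0)=0$, the symmetric rank-one teacher $\theta\, u u^\top$, the symmetrization $(W+W^\top)/2$, the equal-size blocks, and the hypothetical names `gradient_flow_weights` and `simulate`. Under that loss, gradient flow has the closed form $W(t) = \frac{1}{n} Y^\top X \, f_t(\hat\Sigma)$ with $f_t(\lambda) = (1 - e^{-\lambda t})/\lambda$ and $\hat\Sigma = X^\top X / n$, which the code evaluates directly. The second-largest eigenvalue is used only as a rough empirical stand-in for the bulk edge.

```python
import numpy as np


def gradient_flow_weights(X, Y, t):
    """Closed-form gradient-flow iterate W(t), started from W(0) = 0,
    for the assumed loss (1/2n) * ||Y - X W^T||_F^2."""
    n = X.shape[0]
    cov = X.T @ X / n                          # empirical input covariance
    evals, evecs = np.linalg.eigh(cov)
    evals = np.clip(evals, 0.0, None)          # guard against tiny negative round-off
    positive = evals > 1e-12
    safe = np.where(positive, evals, 1.0)
    # Spectral filter f_t(lam) = (1 - exp(-lam * t)) / lam, with limit t as lam -> 0.
    filt = np.where(positive, -np.expm1(-evals * t) / safe, t)
    return (Y.T @ X / n) @ (evecs * filt) @ evecs.T


def simulate(d=200, n=400, theta=3.0, slow=0.1, noise=1.0,
             times=(0.5, 5.0, 500.0), seed=0):
    """Return (t, largest eigenvalue, second-largest eigenvalue) of the
    symmetrized weights at each time in `times` (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    # Two-block covariance: half the directions have variance 1 (fast),
    # the other half have variance `slow` (slow).
    scales = np.concatenate([np.ones(d // 2), np.full(d - d // 2, slow)])
    X = rng.standard_normal((n, d)) * np.sqrt(scales)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    W_star = theta * np.outer(u, u)            # rank-one teacher (assumed symmetric)
    Y = X @ W_star.T + noise * rng.standard_normal((n, d))
    results = []
    for t in times:
        W = gradient_flow_weights(X, Y, t)
        S = (W + W.T) / 2                      # symmetrized weight matrix
        ev = np.linalg.eigvalsh(S)             # eigenvalues in ascending order
        results.append((t, float(ev[-1]), float(ev[-2])))
    return results


if __name__ == "__main__":
    for t, top, second in simulate():
        print(f"t={t:g}  top={top:.3f}  second={second:.3f}")
```

Comparing the gap between the two reported eigenvalues across times gives a crude view of whether a spike stands out from the rest of the spectrum. Locating the actual transition requires the bulk edge from the paper's Dyson equation, which this sketch does not compute.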

Top-level tags: theory, machine learning
Detailed tags: random matrix theory, gradient flow, early stopping, spectral analysis, phase transition

Random Matrix Theory of Early-Stopped Gradient Flow: A Transient BBP Scenario


1️⃣ One-sentence summary

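Using an analytically tractable random-matrix model, this study shows that, in a linear teacher-student setting, the signal is detectable only during an early phase of gradient-descent training before being swamped by overfitting; the cause is anisotropy in the input covariance, which separates fast and slow learning directions and opens a temporary eigenvalue-separation window that depends on signal strength and covariance anisotropy, giving a concise mathematical explanation for why early stopping works.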

Source: arXiv: 2604.18450