Step-Size Stability in Stochastic Optimization: A Theoretical Perspective
1️⃣ One-Sentence Summary
This paper theoretically analyzes how sensitive stochastic optimization methods are to the step-size parameter, identifies a key quantity that measures how much performance degrades when the step size is too large, and provides the first direct theoretical evidence that adaptive step-size methods are more stable and robust than classical stochastic gradient descent.
We present a theoretical analysis of stochastic optimization methods in terms of their sensitivity with respect to the step size. We identify a key quantity that, for each method, describes how the performance degrades as the step size becomes too large. For convex problems, we show that this quantity directly impacts the suboptimality bound of the method. Most importantly, our analysis provides direct theoretical evidence that adaptive step-size methods, such as SPS or NGN, are more robust than SGD. This allows us to quantify the advantage of these adaptive methods beyond empirical evaluation. Finally, we show through experiments that our theoretical bound qualitatively mirrors the actual performance as a function of the step size, even for nonconvex problems.
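To make the step-size sensitivity concrete, here is a minimal, hypothetical sketch (not code from the paper) comparing plain SGD with an SPS-style adaptive step size on a stochastic least-squares problem. The capped rule γ_t = min{(f_i(x) − f_i*)/(c‖∇f_i(x)‖²), γ_max} follows the Stochastic Polyak Step-size of Loizou et al., with f_i* = 0 under interpolation; all helper names and constants below are illustrative assumptions.

```python
import numpy as np

# Hypothetical demo (not from the paper): per-sample least squares
# f_i(x) = 0.5 * (a_i^T x - b_i)^2, constructed so that min_i f_i = 0
# (interpolation), which lets SPS use f_i* = 0.
rng = np.random.default_rng(0)
n, d = 200, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d)  # b = A x*, so every f_i vanishes at x*

def loss_grad(x, i):
    """Loss and gradient of the single sample f_i."""
    r = A[i] @ x - b[i]
    return 0.5 * r * r, r * A[i]

def run(step_rule, gamma, steps=2000):
    """Run SGD with a pluggable step-size rule; return final full-batch loss."""
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)
        f, g = loss_grad(x, i)
        x = x - step_rule(f, g, gamma) * g
    return 0.5 * np.mean((A @ x - b) ** 2)

# Constant step size (plain SGD) vs. a capped SPS-style adaptive step:
# gamma_t = min( f_i(x) / (c * ||grad f_i(x)||^2), gamma ),  with c = 0.5
sgd = lambda f, g, gamma: gamma
sps = lambda f, g, gamma: min(f / (0.5 * (g @ g) + 1e-12), gamma)

for gamma in (0.001, 0.01, 0.1, 1.0):
    print(f"gamma={gamma:<6}  SGD: {run(sgd, gamma):.3e}  "
          f"SPS: {run(sps, gamma):.3e}")
```

Sweeping γ in this toy setup shows the qualitative behavior the paper's bound describes: plain SGD blows up once γ crosses the stability threshold of the problem, while the adaptive cap keeps the SPS-style method convergent over a much wider range of step sizes.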
Source: arXiv: 2602.09842