Stochastic Trust-Region Methods for Over-parameterized Models
1️⃣ One-sentence summary
This paper proposes a new stochastic trust-region optimization framework that adapts the step size automatically, with no manual tuning, and achieves stable, efficient performance on over-parameterized models (such as deep neural networks) and on equality-constrained optimization problems.
Under interpolation-type assumptions such as the strong growth condition, stochastic optimization methods can attain convergence rates comparable to full-batch methods, but their performance, particularly for SGD, remains highly sensitive to step-size selection. To address this issue, we propose a unified stochastic trust-region framework that eliminates manual step-size tuning and extends naturally to equality-constrained problems. For unconstrained optimization, we develop a first-order stochastic trust-region algorithm and show that, under the strong growth condition, it achieves an iteration and stochastic first-order oracle complexity of $O(\varepsilon^{-2} \log(1/\varepsilon))$ for finding an $\varepsilon$-stationary point. For equality-constrained problems, we introduce a quadratic-penalty-based stochastic trust-region method with penalty parameter $\mu$, and establish an iteration and oracle complexity of $O(\varepsilon^{-4} \log(1/\varepsilon))$ to reach an $\varepsilon$-stationary point of the penalized problem, corresponding to an $O(\varepsilon)$-approximate KKT point of the original constrained problem. Numerical experiments on deep neural network training and orthogonally constrained subspace fitting demonstrate that the proposed methods achieve performance comparable to well-tuned stochastic baselines, while exhibiting stable optimization behavior and effectively handling hard constraints without manual learning-rate scheduling.
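The abstract does not spell out the paper's update rules, but the core mechanism of a first-order stochastic trust-region method can be sketched generically: at each iteration, minimize a linear model of the loss built from a mini-batch gradient over a ball of radius $\Delta$, then grow or shrink $\Delta$ based on how well the model's predicted decrease matches the actual decrease. The sketch below (the problem setup, hyperparameters, and acceptance rule are illustrative assumptions, not the paper's algorithm) applies this to a toy over-parameterized least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy over-parameterized least-squares problem: more parameters than
# data points, so the interpolation regime of the paper applies.
n_samples, dim = 20, 50
A = rng.normal(size=(n_samples, dim))
b = rng.normal(size=n_samples)

def loss(w, idx=None):
    """Mini-batch (or full-batch, if idx is None) least-squares loss."""
    if idx is None:
        idx = slice(None)
    r = A[idx] @ w - b[idx]
    return 0.5 * np.mean(r ** 2)

def stoch_grad(w, idx):
    """Stochastic gradient of the loss on a mini-batch."""
    r = A[idx] @ w - b[idx]
    return A[idx].T @ r / len(r)

def stochastic_trust_region(w, delta=1.0, iters=500, batch=5,
                            eta=0.1, gamma=2.0):
    """Generic first-order stochastic trust-region loop (a sketch under
    stated assumptions, not the paper's exact method). The linear model
    m(s) = f(w) + g^T s is minimized over ||s|| <= delta, giving the
    step s = -delta * g / ||g||; the radius replaces the learning rate
    and is adapted by an accept/reject test."""
    for _ in range(iters):
        idx = rng.choice(n_samples, size=batch, replace=False)
        g = stoch_grad(w, idx)
        gnorm = np.linalg.norm(g)
        if gnorm < 1e-12:
            break                                   # (near-)stationary on this batch
        s = -delta * g / gnorm                      # minimizer of the linear model on the ball
        pred = delta * gnorm                        # decrease predicted by the model
        actual = loss(w, idx) - loss(w + s, idx)    # decrease measured on the same batch
        rho = actual / pred
        if rho >= eta:                              # successful step: accept and expand radius
            w = w + s
            delta *= gamma
        else:                                       # unsuccessful step: reject and shrink radius
            delta /= gamma
    return w

w0 = np.zeros(dim)
w_final = stochastic_trust_region(w0.copy())
print(loss(w0), loss(w_final))
```

Note that no learning-rate schedule appears anywhere: the trust-region radius $\Delta$ plays that role and is tuned automatically by the ratio test, which is the practical benefit the abstract emphasizes. For the equality-constrained variant, the same loop would be run on the quadratic-penalty objective $f(x) + \tfrac{\mu}{2}\lVert c(x)\rVert^2$ in place of `loss`.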
Source: arXiv: 2604.14017