On the Stability and Generalization of First-order Bilevel Minimax Optimization
1️⃣ One-sentence summary
This paper gives the first systematic analysis of the generalization ability of gradient-based algorithms for bilevel minimax optimization problems; through theoretical derivation and empirical validation, it reveals the trade-off among algorithmic stability, generalization error, and practical settings.
Bilevel optimization and bilevel minimax optimization have recently emerged as unifying frameworks for a range of machine-learning tasks, including hyperparameter optimization and reinforcement learning. The existing literature focuses on empirical efficiency and convergence guarantees, leaving a critical theoretical gap in understanding how well these algorithms generalize. To bridge this gap, we provide the first systematic generalization analysis for first-order gradient-based bilevel minimax solvers with lower-level minimax problems. Specifically, by leveraging algorithmic stability arguments, we derive fine-grained generalization bounds for three representative algorithms, including single-timescale stochastic gradient descent-ascent, and two variants of two-timescale stochastic gradient descent-ascent. Our results reveal a precise trade-off among algorithmic stability, generalization gaps, and practical settings. Furthermore, extensive empirical evaluations corroborate our theoretical insights on realistic optimization tasks with bilevel minimax structures.
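The algorithms analyzed in the abstract are all built on stochastic gradient descent-ascent (SGDA): the min player takes a descent step while the max player takes an ascent step, with equal step sizes in the single-timescale variant and distinct step sizes in the two-timescale variants. As a hedged illustration (a generic SGDA loop on a toy saddle problem, not the paper's bilevel algorithms; the function `sgda` and the quadratic objective are constructed for this example):

```python
import numpy as np

def sgda(grad_x, grad_y, x0, y0, lr_x, lr_y, steps, noise=0.0, seed=0):
    """(Two-)timescale stochastic gradient descent-ascent sketch.
    Setting lr_x == lr_y recovers the single-timescale variant."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(steps):
        # noisy gradient oracles model the "stochastic" part
        gx = grad_x(x, y) + noise * rng.standard_normal(x.shape)
        gy = grad_y(x, y) + noise * rng.standard_normal(y.shape)
        x = x - lr_x * gx  # descent step for the min player
        y = y + lr_y * gy  # ascent step for the max player
    return x, y

# Toy strongly-convex-strongly-concave saddle:
#   f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, saddle point at (0, 0)
gx = lambda x, y: x + y   # df/dx
gy = lambda x, y: x - y   # df/dy
x_star, y_star = sgda(gx, gy, np.array([1.0]), np.array([-1.0]),
                      lr_x=0.05, lr_y=0.1, steps=2000)
print(x_star, y_star)  # both iterates approach the saddle point at 0
```

With the noiseless two-timescale step sizes above, the iterates contract toward the saddle point; the paper's stability bounds characterize how such step-size choices also control the generalization gap.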
Source: arXiv: 2604.20115