Towards The Implicit Bias on Multiclass Separable Data Under Norm Constraints
1️⃣ One-sentence summary
By proposing a new optimizer called NucGD, this paper reveals how the geometry of the optimization algorithm steers multiclass models toward solutions with low-rank structure, thereby improving generalization.
Implicit bias induced by gradient-based algorithms is essential to the generalization of overparameterized models, yet its mechanisms can be subtle. This work leverages the Normalized Steepest Descent (NSD) framework to investigate how optimization geometry shapes solutions on multiclass separable data. We introduce NucGD, a geometry-aware optimizer designed to enforce low-rank structures through nuclear norm constraints. Beyond the algorithm itself, we connect NucGD with emerging low-rank projection methods, providing a unified perspective. To enable scalable training, we derive an efficient SVD-free update rule via asynchronous power iteration. Furthermore, we empirically dissect the impact of stochastic optimization dynamics, characterizing how varying levels of gradient noise induced by mini-batch sampling and momentum modulate the convergence toward the expected maximum margin. The code is accessible at: this https URL.
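For intuition on the mechanism the abstract describes: normalized steepest descent with respect to the nuclear norm moves along the rank-one matrix formed by the gradient's top singular vectors, which can be approximated without a full SVD via power iteration. The sketch below illustrates this idea only; the function names `top_singular_pair` and `nucgd_step`, the step size, and the exact update form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def top_singular_pair(G, num_iters=50, seed=0):
    # Approximate the top left/right singular vectors of G by power
    # iteration on G^T G, avoiding a full SVD (an SVD-free oracle,
    # in the spirit of the paper's update rule).
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(G.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        u = G @ v
        u /= np.linalg.norm(u)
        v = G.T @ u
        v /= np.linalg.norm(v)
    return u, v

def nucgd_step(W, G, lr=0.1):
    # One hypothetical NucGD-style step: steepest descent normalized
    # by the nuclear norm selects the rank-one direction u v^T built
    # from the gradient's top singular vectors (a sketch under the
    # assumptions stated in the lead-in).
    u, v = top_singular_pair(G)
    return W - lr * np.outer(u, v)
```

Note that each step is a rank-one update, which is what biases the iterates toward low-rank solutions.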
Source: arXiv:2603.22824