FlowAdam: Implicit Regularization via Geometry-Aware Soft Momentum Injection
1️⃣ One-Sentence Summary
This paper proposes a new hybrid optimizer called FlowAdam, which combines the Adam optimizer with continuous gradient flow. On complex tasks with coupled parameters (such as matrix factorization), it automatically provides an implicit regularization effect that significantly improves model performance, while leaving performance on simpler tasks unaffected.
Adaptive moment methods such as Adam use a diagonal, coordinate-wise preconditioner based on exponential moving averages of squared gradients. This diagonal scaling is coordinate-system dependent and can struggle with dense or rotated parameter couplings, including those in matrix factorization, tensor decomposition, and graph neural networks, because it treats each parameter independently. We introduce FlowAdam, a hybrid optimizer that augments Adam with continuous gradient-flow integration via an ordinary differential equation (ODE). When EMA-based statistics detect landscape difficulty, FlowAdam switches to clipped ODE integration. Our central contribution is Soft Momentum Injection, which blends ODE velocity with Adam's momentum during mode transitions. This prevents the training collapse observed with naive hybrid approaches. Across coupled optimization benchmarks, the ODE integration provides implicit regularization, reducing held-out error by 10-22% on low-rank matrix/tensor recovery and 6% on Jester (real-world collaborative filtering), also surpassing tuned Lion and AdaBelief, while matching Adam on well-conditioned workloads (CIFAR-10). MovieLens-100K confirms benefits arise specifically from coupled parameter interactions rather than bias estimation. Ablation studies show that soft injection is essential, as hard replacement reduces accuracy from 100% to 82.5%.
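The abstract describes three mechanisms: standard Adam updates, a switch to clipped gradient-flow (ODE) integration when EMA statistics flag a difficult landscape, and Soft Momentum Injection that blends the ODE velocity into Adam's momentum rather than replacing it. The sketch below illustrates how these pieces could fit together; the difficulty heuristic, thresholds, and blending coefficient are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def flowadam_sketch(grad_fn, x0, steps=500, lr=1e-2,
                    beta1=0.9, beta2=0.999, eps=1e-8,
                    switch_var=1.0, ode_clip=1.0, blend=0.1):
    """Hedged sketch of a FlowAdam-style step. All names and the
    difficulty heuristic below are assumptions for illustration."""
    x = x0.astype(float).copy()
    m = np.zeros_like(x)   # Adam first moment (momentum)
    v = np.zeros_like(x)   # Adam second moment
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Assumed difficulty proxy: EMA-based gradient variance.
        difficulty = np.mean(v_hat - m_hat ** 2)
        if difficulty > switch_var:
            # Clipped explicit-Euler step of the gradient flow dx/dt = -grad f(x).
            v_ode = -np.clip(g, -ode_clip, ode_clip)
            # Soft Momentum Injection: blend the ODE velocity into Adam's
            # momentum instead of hard replacement, which the paper reports
            # collapses training.
            m = (1 - blend) * m + blend * (-v_ode)
            x = x + lr * v_ode
        else:
            x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy usage on a coupled quadratic f(x) = 0.5 * x^T A x (off-diagonal
# terms mimic the parameter coupling the paper targets).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
x_final = flowadam_sketch(lambda x: A @ x, np.array([1.0, 1.0]))
```

The key design point mirrored here is that the mode transition updates the shared momentum state continuously, so switching back to Adam does not start from a stale or contradictory momentum estimate.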
Source: arXiv 2604.06652