Path-conditioned training: a principled way to rescale ReLU neural networks
1️⃣ One-sentence summary
This paper proposes a new, geometrically principled method that improves training dynamics by optimizing the rescaling of ReLU neural network parameters, effectively speeding up model training.
Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled sets of weights implement the same function, their training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion for rescaling neural network parameters, whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
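The rescaling symmetry the abstract refers to is easy to verify directly: for any hidden ReLU neuron, multiplying its incoming weights by a positive factor and dividing its outgoing weights by the same factor leaves the network function unchanged, since relu(λz) = λ·relu(z) for λ > 0. A minimal NumPy sketch of this invariance (an illustration of the symmetry only, not the paper's conditioning algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # input -> hidden weights
b1 = rng.normal(size=4)        # hidden biases
W2 = rng.normal(size=(2, 4))   # hidden -> output weights

def forward(W1, b1, W2, x):
    """One-hidden-layer ReLU network."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0)

x = rng.normal(size=3)
lam = rng.uniform(0.5, 2.0, size=4)  # one positive scale per hidden neuron

# Rescale: incoming weights and bias of each neuron by lam,
# outgoing weights by 1/lam -- the function is unchanged.
W1s = lam[:, None] * W1
b1s = lam * b1
W2s = W2 / lam[None, :]

assert np.allclose(forward(W1, b1, W2, x), forward(W1s, b1s, W2s, x))
print("outputs match under rescaling")
```

The paper's contribution is choosing *which* point on this symmetry orbit to train from, via a geometric criterion in the path-lifting space; the sketch above only demonstrates that the orbit exists.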
Source: arXiv: 2602.19799