Preserving Plasticity in Continual Learning via Dynamical Isometry

📄 Abstract - Preserving Plasticity in Continual Learning via Dynamical Isometry

Continual training of deep neural networks under non-stationarity often leads to a progressive loss of plasticity, eventually limiting further learning. We relate plasticity to the empirical Neural Tangent Kernel, and identify dynamical isometry (the condition that layer-wise Jacobian singular values remain close to one) as a key mechanism for preserving plasticity in continual learning. We revisit a class of networks that are almost-everywhere isometric while remaining universal Lipschitz function approximators, demonstrating that near-dynamical isometry is compatible with expressive nonlinear representations. For general architectures, we propose an efficient isometry-promoting regularization scheme and identify a novel mechanism by which it can reactivate dormant ReLU units. Building on this, we introduce AdamO, an Adam-style adaptive optimizer that decouples isometry regularization from gradient updates, analogous to AdamW. We further reinterpret prior plasticity-preserving approaches through the lens of dynamical isometry, showing that they target only a partial measure of isometry. Across supervised and reinforcement-learning continual-learning benchmarks designed to induce plasticity loss, our methods consistently match or outperform existing approaches.
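To make the isometry condition concrete, the sketch below measures how far a single weight matrix's singular values sit from one and nudges them back with a penalty step that is applied directly to the weights, separately from any loss-gradient update, in the spirit of AdamW's decoupled weight decay. It is not the paper's regularizer or AdamO: the soft-orthogonality penalty `||W^T W - I||_F^2`, the function names, and the `strength` parameter are all assumptions chosen for illustration, and a single linear layer stands in for the layer-wise Jacobian.

```python
import numpy as np


def singular_value_spread(W):
    """Largest deviation of W's singular values from 1 (0 means exactly isometric)."""
    s = np.linalg.svd(W, compute_uv=False)
    return float(np.max(np.abs(s - 1.0)))


def isometry_penalty(W):
    """Assumed penalty: squared Frobenius norm of W^T W - I."""
    G = W.T @ W - np.eye(W.shape[1])
    return float(np.sum(G * G))


def isometry_penalty_grad(W):
    """Gradient of the assumed penalty with respect to W: 4 W (W^T W - I)."""
    return 4.0 * W @ (W.T @ W - np.eye(W.shape[1]))


def decoupled_isometry_step(W, strength):
    """Move W along the negative penalty gradient, independent of the task loss.

    `strength` is a hypothetical coefficient playing the role that the
    weight-decay coefficient plays in AdamW.
    """
    return W - strength * isometry_penalty_grad(W)


# Build a square matrix with known singular values spread between 0.3 and 2.0.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(8, 8)))
V, _ = np.linalg.qr(rng.normal(size=(8, 8)))
W0 = U @ np.diag(np.linspace(0.3, 2.0, 8)) @ V.T

# Apply only the isometry step, with no task gradient, to isolate its effect.
W = W0.copy()
for _ in range(300):
    W = decoupled_isometry_step(W, strength=0.01)

print("spread before:", singular_value_spread(W0))
print("spread after: ", singular_value_spread(W))
```

In a real training loop, a step like this would be interleaved with the optimizer's ordinary update, so the coefficient trades off isometry against fitting the current task.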

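1️⃣ One-Sentence Summary

This paper finds that maintaining dynamical isometry in a neural network during continual learning (that is, keeping each layer's weight transformation from inflating or shrinking signal magnitude) effectively prevents the model from gradually losing its ability to learn new tasks. Building on this, it proposes an efficient regularization method and an adaptive optimizer that match or outperform existing approaches on multiple benchmarks.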
