Provable Data Scaling Law for Meta Learning via Complexity Minimization
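1️⃣ One-sentence summary
This paper proposes a meta-representation learning framework called "complexity minimization." It evaluates the downstream model complexity best suited to each domain and minimizes the worst-case complexity across domains, and it proves theoretically that the few-shot learning error rate keeps decreasing as the amount of pre-training data grows, which offers a principled explanation for the performance gains observed when pre-training data is scaled up.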
Pre-training has become a fundamental paradigm in modern machine learning, with one of its key empirical benefits being reduced downstream sample complexity as the scale of pre-training data increases. However, existing theoretical frameworks for pre-training do not fully explain this phenomenon. In this paper, we introduce complexity minimization, a novel meta-representation learning framework designed to enable theoretical analysis of this scaling behavior. It learns representations by evaluating the downstream model complexity best suited to each domain and minimizing the worst case of that complexity across source domains. Our end-to-end theoretical analysis, spanning pre-training through downstream regression, shows that this framework provably captures this scaling behavior; in particular, we show that the error rate of few-shot adaptation improves as the amount of meta-training data grows. Empirically, we demonstrate that incorporating complexity regularization into existing meta-learning methods consistently improves downstream sample efficiency.
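The worst-case objective described in the abstract can be written schematically as a min-max problem. This is only a sketch of the idea as stated above, not the paper's actual formulation: the symbols \phi (a representation), \Phi (the representation class), T (the number of source domains), and C_t (the complexity of the downstream model best suited to source domain t when built on \phi) are all notation assumed here for illustration.

```latex
% Schematic form only; all symbols are assumed notation, not taken from the paper.
% C_t(\phi): complexity of the downstream model best suited to source domain t,
%            evaluated on top of the representation \phi.
\hat{\phi} \in \arg\min_{\phi \in \Phi} \; \max_{t \in \{1, \dots, T\}} \; C_t(\phi)
```

Read this way, the inner maximum picks out the source domain that needs the most complex downstream model, and the outer minimum chooses the representation that keeps that hardest case as simple as possible.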
Source: arXiv: 2606.02008