arXiv submission date: 2026-02-04
📄 Abstract - Provable Target Sample Complexity Improvements as Pre-Trained Models Scale

Pre-trained models have become indispensable for efficiently building models across a broad spectrum of downstream tasks. The advantages of pre-trained models have been highlighted by empirical studies on scaling laws, which demonstrate that larger pre-trained models can significantly reduce the sample complexity of downstream learning. However, existing theoretical investigations of pre-trained models lack the capability to explain this phenomenon. In this paper, we provide a theoretical investigation by introducing a novel framework, caulking, inspired by parameter-efficient fine-tuning (PEFT) methods such as adapter-based fine-tuning, low-rank adaptation, and partial fine-tuning. Our analysis establishes that improved pre-trained models provably decrease the sample complexity of downstream tasks, thereby offering theoretical justification for the empirically observed scaling laws relating pre-trained model size to downstream performance, a relationship not covered by existing results.
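The framework draws on parameter-efficient fine-tuning (PEFT) methods, one of which is low-rank adaptation (LoRA). As a point of reference for readers unfamiliar with these methods, the sketch below shows the basic LoRA idea: freeze the pre-trained weights and train only a small low-rank correction. This is a minimal illustration, not the paper's construction; the class name, rank, and scaling hyperparameters are illustrative assumptions.

```python
# Minimal LoRA sketch: freeze a pre-trained linear layer, train a low-rank update.
# Hyperparameters (rank, alpha) and dimensions are illustrative, not from the paper.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pre-trained linear layer with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # pre-trained weights stay frozen
            p.requires_grad = False
        # Low-rank factors A (d_in x r) and B (r x d_out); only these are trained.
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output = frozen pre-trained map + scaled low-rank correction.
        return self.base(x) + (x @ self.A @ self.B) * self.scaling

# Usage on a single (hypothetical) pre-trained layer.
pretrained_layer = nn.Linear(768, 768)
adapted = LoRALinear(pretrained_layer, rank=8)
y = adapted(torch.randn(4, 768))           # shape: (4, 768)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # 2 * 768 * 8 = 12288, vs. 768*768 + 768 in the base layer
```

The downstream learner only fits the low-rank factors, which is the kind of restricted adaptation whose sample complexity the paper analyzes.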

Top-level tags: machine learning, model training, theory
Detailed tags: pre-trained models, sample complexity, scaling laws, theoretical analysis, fine-tuning

Provable Target Sample Complexity Improvements as Pre-Trained Models Scale


1️⃣ One-Sentence Summary

Using a new theoretical framework called "caulking", this paper gives the first theoretical proof that larger pre-trained models do reduce the amount of data needed to learn downstream tasks, providing a rigorous mathematical explanation for the empirically observed pattern that bigger pre-trained models yield better downstream performance.

Source: arXiv: 2602.04233