📄
Abstract - AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers
Frozen-backbone transfer with Vision Transformers faces two under-addressed issues: optimization instability when adapters are naively inserted into a fixed feature extractor, and the absence of principled guidance for setting adapter capacity. We introduce AdapterTune, which augments each transformer block with a residual low-rank bottleneck whose up-projection is zero-initialized, guaranteeing that the adapted network starts exactly at the pretrained function and eliminating early-epoch representation drift. On the analytical side, we formalize adapter rank as a capacity budget for approximating downstream task shifts in feature space. The resulting excess-risk decomposition predicts monotonic but diminishing accuracy gains with increasing rank, an "elbow" behavior we confirm through controlled sweeps. We evaluate on 9 datasets and 3 backbone scales with multi-seed reporting throughout. On a core 5-dataset transfer suite, AdapterTune improves top-1 accuracy over head-only transfer by +14.9 points on average while training only 0.92% of the parameters required by full fine-tuning, and outperforms full fine-tuning on 10 of 15 dataset-backbone pairs. Across the full benchmark, AdapterTune improves over head-only transfer on every dataset-backbone pair tested. Ablations on rank, placement, and initialization isolate each design choice. The code is available at: this https URL
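The zero-initialization idea can be seen in a minimal numpy sketch: the down-projection is randomly initialized, the up-projection starts at zero, so the residual branch contributes nothing at initialization and the adapted block reproduces the frozen features exactly. Shapes, the ReLU nonlinearity, and the scaling below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_adapter(d_model, rank):
    """Low-rank bottleneck weights (hypothetical shapes, not the paper's code)."""
    W_down = rng.standard_normal((d_model, rank)) / np.sqrt(d_model)
    W_up = np.zeros((rank, d_model))  # zero init => adapter branch outputs 0 at start
    return W_down, W_up

def adapter_forward(x, W_down, W_up):
    # Residual connection: frozen features pass through unchanged at initialization.
    return x + np.maximum(x @ W_down, 0.0) @ W_up  # ReLU nonlinearity assumed

W_down, W_up = make_adapter(16, rank=4)
x = rng.standard_normal((4, 16))  # a batch of token features from the frozen backbone
print(np.allclose(adapter_forward(x, W_down, W_up), x))  # True at initialization
```

Because the adapted network starts exactly at the pretrained function, early training updates only gradually move the representation, which is the stability property the abstract claims.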
AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers
1️⃣ One-sentence summary
This paper proposes AdapterTune, a method that adds zero-initialized low-rank adapters to frozen Vision Transformer models. It addresses the optimization instability and the adapter-capacity selection problem that arise during fine-tuning, and achieves better image classification performance than both head-only training and full fine-tuning while training far fewer parameters.