Accelerated Dynamic Importance Weighting with Versatile Divergence-Minimizing Estimators
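1️⃣ One-sentence summary
This paper proposes accelerated dynamic importance weighting (ADIW), which combines a few lightweight, warm-started projected gradient updates with a unified divergence-minimization framework. When training and test distributions differ, this makes deep learning substantially more computationally efficient and weight estimation more flexible, and it achieves better performance on multiple datasets.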
Importance weighting (IW) is a golden solver for joint distribution shift, where the joint distributions differ between the training and test data. To solve this problem, IW estimates test-to-training density ratios as importance weights and reweights the training losses accordingly. Recent advances in dynamic IW (DIW) integrate weight estimation into model training, enabling scalable IW for deep models and achieving strong performance on large modern datasets. Despite its promise, DIW remains limited in two aspects. First, it incurs substantial computational overhead by solving a kernel mean matching (KMM)-induced optimization problem to convergence in every mini-batch. Second, it relies solely on KMM for weight estimation, whereas the IW literature contains diverse estimation methods based on different divergence measures. In this paper, we propose accelerated DIW (ADIW), a unified and efficient IW framework for deep learning under joint distribution shift. ADIW performs a few lightweight projected gradient descent updates that warm-start from previously updated weights, substantially improving efficiency. Moreover, ADIW generalizes DIW into a unified divergence-minimization framework that supports diverse weight-estimation methods in a plug-and-play manner, including those based on the Kullback-Leibler divergence, squared distance, and Wasserstein-1 distance. We establish convergence guarantees for ADIW under mild conditions, and empirical results demonstrate that ADIW achieves state-of-the-art IW performance while being substantially more efficient.
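The warm-started update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel on raw inputs, a simplified KMM objective of the form 0.5 wᵀKw − κᵀw with box constraints only, and a persistent per-example weight table. The names `rbf_kernel`, `kmm_objective`, `pgd_update`, `weights`, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kmm_objective(w, K, kappa):
    """Simplified KMM objective (constants dropped): 0.5 w'Kw - kappa'w."""
    return 0.5 * w @ K @ w - kappa @ w

def pgd_update(w, K, kappa, n_steps=3, upper=10.0):
    """Run a few projected gradient steps instead of solving to convergence.

    The feasible set is the box [0, upper]^n (an assumption), so the
    Euclidean projection is an elementwise clip.
    """
    step = 1.0 / np.linalg.eigvalsh(K)[-1]  # 1 / largest eigenvalue of K
    for _ in range(n_steps):
        grad = K @ w - kappa
        w = np.clip(w - step * grad, 0.0, upper)
    return w

# Toy data: the test distribution is a shifted copy of the training one.
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=(256, 2))
x_test = rng.normal(0.5, 1.0, size=(256, 2))

# Persistent per-example weights (assumption): each mini-batch resumes
# from the values left by the previous update of the same examples.
weights = np.ones(len(x_train))

for _ in range(20):
    idx = rng.choice(len(x_train), size=32, replace=False)
    xb_test = x_test[rng.choice(len(x_test), size=32, replace=False)]
    K = rbf_kernel(x_train[idx], x_train[idx])
    kappa = (len(idx) / len(xb_test)) * rbf_kernel(x_train[idx], xb_test).sum(axis=1)
    weights[idx] = pgd_update(weights[idx], K, kappa)  # warm start
```

In a training loop, the entries of `weights[idx]` would multiply the per-example losses of the mini-batch. Swapping `kmm_objective` and its gradient for another divergence would correspond to the plug-and-play idea in the abstract, though the exact objectives used by ADIW are not given here.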
Source: arXiv: 2605.25499