Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation
1️⃣ One-sentence summary
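This paper analyzes how data augmentation acts as a regularizer in supervised regression and gives an exact expression for the test error that depends only on the population distribution of the true data and on the first- and second-order statistics of the augmentation scheme; the result holds under misspecified feature maps and when only the last layer of the network is trained.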
This paper analyzes the regularization effect that data augmentation induces on supervised regression methods in the proportional regime, where the number of covariates grows proportionally to the number of samples. We provide a tight characterization of the test error, measured as mean squared error, in terms of only the population quantities of the true data and the first- and second-order statistics of the augmentation scheme. Our results hold under misspecified feature maps and for any network architecture in which only the last readout layer is trained while the rest of the network is either frozen or randomly initialized. We specialize our results to Gaussian data and show that our asymptotic characterization is tight in this setting.
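To make the role of the first- and second-order statistics concrete, here is a minimal NumPy sketch. It is not the paper's asymptotic characterization; it only checks a finite-sample identity: a least-squares readout trained on augmented random features depends on the augmentations solely through each sample's feature mean and covariance. The dimensions, the ReLU feature map, and the additive-noise augmentation are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p, K = 40, 10, 15, 50  # samples, input dim, random features, augmentations per sample

X = rng.normal(size=(n, d))                                   # Gaussian covariates
y = X @ rng.normal(size=d) / np.sqrt(d) + 0.1 * rng.normal(size=n)
W = rng.normal(size=(d, p)) / np.sqrt(d)                      # frozen random first layer


def features(x):
    """Frozen random-feature map (ReLU is an assumption); only the readout is trained."""
    return np.maximum(x @ W, 0.0)


# Hypothetical augmentation: additive Gaussian noise on the inputs.
X_aug = X[:, None, :] + 0.3 * rng.normal(size=(n, K, d))      # shape (n, K, d)
Phi_aug = features(X_aug)                                     # shape (n, K, p)

# (a) Least squares on the stacked augmented dataset (each label repeated K times).
A = Phi_aug.reshape(n * K, p)
b = np.repeat(y, K)
theta_aug, *_ = np.linalg.lstsq(A, b, rcond=None)

# (b) Same readout computed from per-sample moments of the augmented features only.
mu = Phi_aug.mean(axis=1)                                     # per-sample feature means, (n, p)
centered = Phi_aug - mu[:, None, :]
# Sum over samples of the per-sample (biased) feature covariances, shape (p, p).
Sigma = np.einsum("ikp,ikq->pq", centered, centered) / K
theta_mom = np.linalg.solve(mu.T @ mu + Sigma, mu.T @ y)

print(np.max(np.abs(theta_aug - theta_mom)))                  # difference at floating-point level
```

The covariance term `Sigma` enters the normal equations exactly where a ridge penalty would, which is one way to read the statement that augmentation acts as a regularizer; for isotropic feature-space noise it reduces to a multiple of the identity.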
Source: arXiv: 2605.10290