不变性学习的几何:数据增强与泛化的信息论分析 / The geometry of invariant learning: an information-theoretic analysis of data augmentation and generalization
1️⃣ One-sentence summary
Through an information-theoretic framework, this paper reveals how data augmentation shapes a model's generalization ability via a geometric notion called the "group diameter", and quantifies the intrinsic trade-off between data fidelity and regularization effect.
Data augmentation is one of the most widely used techniques to improve generalization in modern machine learning, often justified by its ability to promote invariance to label-irrelevant transformations. However, its theoretical role remains only partially understood. In this work, we propose an information-theoretic framework that systematically accounts for the effect of augmentation on generalization and invariance learning. Our approach builds upon mutual information-based bounds, which relate the generalization gap to the amount of information a learning algorithm retains about its training data. We extend this framework by modeling the augmented distribution as a composition of the original data distribution with a distribution over transformations, which naturally induces an orbit-averaged loss function. Under mild sub-Gaussian assumptions on the loss function and the augmentation process, we derive a new generalization bound that decomposes the expected generalization gap into three interpretable terms: (1) a distributional divergence between the original and augmented data, (2) a stability term measuring the algorithm's dependence on training data, and (3) a sensitivity term capturing the effect of augmentation variability. To connect our bounds to the geometry of the augmentation group, we introduce the notion of group diameter, defined as the maximal perturbation that augmentations can induce in the input space. The group diameter provides a unified control parameter that bounds all three terms and highlights an intrinsic trade-off: small diameters preserve data fidelity but offer limited regularization, while large diameters enhance stability at the cost of increased bias and sensitivity. We validate our theoretical bounds with numerical experiments, demonstrating that they reliably track and predict the behavior of the true generalization gap.
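Two of the abstract's central objects are easy to make concrete: the orbit-averaged loss (the expected loss over transformed copies of an input) and the group diameter (the maximal input-space perturbation an augmentation can induce). The following is a minimal illustrative sketch, not the paper's implementation: it assumes a hypothetical augmentation group of small 2-D rotations and a simple squared loss, and estimates both quantities by Monte Carlo sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical augmentation group: small planar rotations of 2-D inputs.
def rotate(x, theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ x

def orbit_averaged_loss(loss, w, x, y, thetas):
    """Monte-Carlo estimate of the orbit-averaged loss E_g[loss(w, g(x), y)]."""
    return np.mean([loss(w, rotate(x, t), y) for t in thetas])

def empirical_group_diameter(xs, thetas):
    """Max input-space perturbation ||g(x) - x|| over sampled points and group elements."""
    return max(np.linalg.norm(rotate(x, t) - x) for x in xs for t in thetas)

# Toy model: squared loss of a linear predictor (hypothetical choice).
loss = lambda w, x, y: (w @ x - y) ** 2

w = np.array([1.0, -0.5])
x = np.array([2.0, 1.0])
thetas = rng.uniform(-0.1, 0.1, size=64)  # a small-diameter group

oal = orbit_averaged_loss(loss, w, x, 1.0, thetas)
diam = empirical_group_diameter([x], thetas)
print(f"orbit-averaged loss: {oal:.4f}, group diameter: {diam:.4f}")
```

Widening the sampling range of `thetas` increases the empirical diameter, which is exactly the knob the paper's trade-off is stated in: a larger diameter moves the orbit-averaged loss further from the pointwise loss (more bias and sensitivity) while averaging over a broader orbit (more regularization).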
Source: arXiv:2602.14423