arXiv submission date: 2026-06-08
📄 Abstract - Data augmented bootstrap: Unifying confidence interval construction by approximate invariance

We propose the data augmented bootstrap (DAB), a framework for constructing confidence intervals from approximately invariant transformations of the data. As special cases, DAB recovers popular methods that rely on exact group symmetries, such as conformal prediction, wild bootstrap for Maximum Mean Discrepancy U-statistics and the recently proposed SymmPI. Meanwhile, DAB also recovers the classical bootstrap method, which exploits the dataset's approximate invariance under uniform sampling of data indices as the dataset size grows. For all DAB methods, we establish theoretical coverage results that interpolate between finite-sample and asymptotic guarantees according to the strength of the invariance, and without assuming a group structure. The approximate invariance is measured in the Kolmogorov distance and, for statistics that satisfy Gaussian universality, reduces to conditional mean and variance matching. This allows us to incorporate data augmentation (DA), a widely used machine learning heuristic based on approximate invariances, into known statistical methods. We empirically test the performance of incorporating DA into bootstrap, wild bootstrap and conformal prediction for simulated settings as well as for image, language and scientific data.
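
For orientation, the classical bootstrap that the abstract names as a special case can be sketched as follows. This is a generic percentile bootstrap written with the Python standard library, not the paper's DAB procedure; the function name `percentile_bootstrap_ci` and its parameters (`n_resamples`, `alpha`, `seed`) are illustrative assumptions that do not come from the source.

```python
import random
import statistics


def percentile_bootstrap_ci(data, statistic=statistics.fmean,
                            n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval (illustrative sketch only).

    Resamples data indices uniformly with replacement, which is the
    approximate invariance the abstract attributes to the classical bootstrap.
    """
    rng = random.Random(seed)
    n = len(data)

    # Recompute the statistic on each resampled dataset and sort the results.
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )

    # Take empirical alpha/2 and 1 - alpha/2 quantiles, clamped to valid indices.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return replicates[lo_idx], replicates[hi_idx]


if __name__ == "__main__":
    sample = [float(x) for x in range(1, 51)]
    print(percentile_bootstrap_ci(sample))
```

A data-augmented variant would, as the abstract describes it, replace the uniform index resampling with approximately invariant transformations of the data; how that is done and calibrated is specified in the paper and is not reproduced here.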

Top-level tags: machine learning data
Detailed tags: confidence interval, bootstrap, data augmentation, approximate invariance, conformal prediction

Data augmented bootstrap: Unifying confidence interval construction by approximate invariance


1️⃣ One-sentence summary

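This paper proposes a unified framework called the "data augmented bootstrap" that connects several existing statistical methods (such as the bootstrap and conformal prediction) with the data augmentation techniques commonly used in machine learning, constructing confidence intervals from the "approximate invariance" of the data under transformations, with theoretical coverage guarantees that range from finite-sample to large-sample regimes.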

Source: arXiv: 2606.09049