Noise-Calibrated Inference from Differentially Private Sufficient Statistics in Exponential Families
1️⃣ One-sentence summary
This paper proposes a new method for reliable statistical inference under data privacy: first release differentially private sufficient statistics, then calibrate inference to the injected noise, yielding trustworthy statistical conclusions or synthetic data.
Many differentially private (DP) data release systems either output DP synthetic data and leave analysts to perform inference as usual, which can lead to severe miscalibration, or output a DP point estimate without a principled way to do uncertainty quantification. This paper develops a clean and tractable middle ground for exponential families: release only DP sufficient statistics, then perform noise-calibrated likelihood-based inference and optional parametric synthetic data generation as post-processing. Our contributions are: (1) a general recipe for approximate-DP release of clipped sufficient statistics under the Gaussian mechanism; (2) asymptotic normality, explicit variance inflation, and valid Wald-style confidence intervals for the plug-in DP MLE; (3) a noise-aware likelihood correction that is first-order equivalent to the plug-in but supports bootstrap-based intervals; and (4) a matching minimax lower bound showing the privacy distortion rate is unavoidable. The resulting theory yields concrete design rules and a practical pipeline for releasing DP synthetic data with principled uncertainty quantification, validated on three exponential families and real census data.
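The pipeline above — clip per-record contributions, add Gaussian noise calibrated to (ε, δ)-DP, then form a plug-in MLE with a variance-inflated Wald interval — can be sketched for the simplest exponential family, a Bernoulli mean. This is a minimal illustration, not the paper's implementation: the function name `dp_mean_ci`, the classical Gaussian-mechanism calibration σ = Δ·√(2 ln(1.25/δ))/ε, and the Bernoulli setting are all assumptions made for concreteness.

```python
import math
import random

def dp_mean_ci(x, eps, delta, clip=(0.0, 1.0), alpha=0.05, seed=0):
    """Release a clipped sum via the Gaussian mechanism, then do
    noise-calibrated plug-in inference for a Bernoulli mean.
    Illustrative sketch only; names and calibration are assumptions."""
    rng = random.Random(seed)
    lo, hi = clip
    n = len(x)
    # 1. Clip each record so the sum's sensitivity is bounded by (hi - lo).
    T = sum(min(max(v, lo), hi) for v in x)
    # 2. Gaussian mechanism with the classical (eps, delta) calibration.
    sens = hi - lo
    sigma = sens * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    T_dp = T + rng.gauss(0.0, sigma)
    # 3. Plug-in DP MLE: treat the noisy statistic as if it were exact.
    theta = min(max(T_dp / n, lo), hi)
    # 4. Wald interval with explicit variance inflation:
    #    sampling variance theta(1-theta)/n plus privacy noise (sigma/n)^2.
    var = theta * (1.0 - theta) / n + (sigma / n) ** 2
    z = 1.959963984540054  # Phi^{-1}(0.975) for a 95% interval
    half = z * math.sqrt(var)
    return theta, (theta - half, theta + half)
```

The `(sigma / n) ** 2` term is the "explicit variance inflation" the abstract refers to: ignoring it (inference as usual on DP outputs) is exactly the miscalibration the paper warns about.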
Source: arXiv:2603.02010