arXiv submission date: 2026-02-19
📄 Abstract - Efficient privacy loss accounting for subsampling and random allocation

We consider the privacy amplification properties of a sampling scheme in which a user's data is used in $k$ steps chosen randomly and uniformly from a sequence (or set) of $t$ steps. This sampling scheme has recently been applied in the context of differentially private optimization (Chua et al., 2024a; Choquette-Choo et al., 2025) and communication-efficient high-dimensional private aggregation (Asi et al., 2025), where it was shown to have utility advantages over standard Poisson sampling. Theoretical analyses of this sampling scheme (Feldman & Shenfeld, 2025; Dong et al., 2025) lead to bounds that are close to those of Poisson sampling, yet still have two significant shortcomings. First, in many practical settings, the resulting privacy parameters are not tight due to the approximation steps in the analysis. Second, the computed parameters are expressed as either the hockey-stick or the Rényi divergence, both of which introduce overheads when used in privacy loss accounting. In this work, we demonstrate that the privacy loss distribution (PLD) of random allocation applied to any differentially private algorithm can be computed efficiently. When applied to the Gaussian mechanism, our results demonstrate that the privacy-utility trade-off of random allocation is at least as good as that of Poisson subsampling; in particular, random allocation is better suited for training via DP-SGD. To support these computations, our work develops new tools for general privacy loss accounting based on a notion of PLD realization. This notion allows us to extend accurate privacy loss accounting to subsampling, which previously required manual, noise-mechanism-specific analysis.
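To make the abstract's terminology concrete, here is the standard textbook PLD computation for the (unsampled) Gaussian mechanism with sensitivity 1 and noise scale $\sigma$; this is background material, not the paper's random-allocation derivation. For adjacent output distributions $P = \mathcal{N}(1, \sigma^2)$ and $Q = \mathcal{N}(0, \sigma^2)$, the privacy loss at outcome $x$ is

$$\ell(x) = \ln\frac{P(x)}{Q(x)} = \frac{2x - 1}{2\sigma^2},$$

so under $x \sim P$ the privacy loss is itself Gaussian, $\ell \sim \mathcal{N}\!\left(\tfrac{1}{2\sigma^2}, \tfrac{1}{\sigma^2}\right)$; this distribution is the PLD. The $(\varepsilon, \delta)$ guarantee is then read off via the hockey-stick divergence

$$\delta(\varepsilon) = H_{e^{\varepsilon}}(P \,\|\, Q) = \mathbb{E}_{x \sim P}\!\left[\left(1 - e^{\varepsilon - \ell(x)}\right)_{+}\right],$$

and composing steps corresponds to convolving their PLDs, which is what makes numerical privacy loss accounting efficient.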

Top-level tags: theory, machine learning, model training
Detailed tags: differential privacy, privacy amplification, subsampling, privacy loss distribution, dp-sgd

Efficient privacy loss accounting for subsampling and random allocation


1️⃣ One-sentence summary

This paper proposes an efficient method for computing the privacy loss distribution of the random-allocation sampling scheme, shows that its privacy guarantees are at least as good as those of standard Poisson subsampling, and argues that it is especially well suited to machine-learning training scenarios such as differentially private stochastic gradient descent (DP-SGD).
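For readers who want to experiment, below is a minimal accounting sketch using Google's open-source dp_accounting library, which implements the PLD pipeline this kind of result plugs into. The sketch covers only the Poisson-subsampled Gaussian baseline; the random-allocation PLD is the paper's contribution and is not implemented here. All numeric values (sigma, q, steps, delta) are illustrative placeholders, not values from the paper.

```python
# Minimal PLD accounting sketch for the Poisson-subsampled Gaussian
# mechanism (pip install dp-accounting). The paper's contribution is an
# analogous PLD for the random-allocation scheme, which this sketch does
# NOT implement; all numeric parameters below are illustrative only.
from dp_accounting.pld import privacy_loss_distribution

sigma = 1.0       # noise multiplier (Gaussian std / L2 sensitivity)
q = 0.01          # Poisson sampling probability (roughly k/t in the paper)
steps = 10_000    # number of composed DP-SGD steps
delta = 1e-6      # target delta

# PLD of a single Poisson-subsampled Gaussian step, then tight numerical
# composition over all steps via convolution of the discretized PLD.
single_step = privacy_loss_distribution.from_gaussian_mechanism(
    standard_deviation=sigma,
    sampling_prob=q,
    value_discretization_interval=1e-4,
)
composed = single_step.self_compose(steps)

print(f"epsilon at delta={delta}: {composed.get_epsilon_for_delta(delta):.4f}")
```

A random-allocation accountant built on the paper's results would, in principle, replace only the single-step PLD construction; the composition and epsilon-extraction steps stay the same.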

Source: arXiv:2602.17284