arXiv submission date: 2026-05-26
📄 Abstract - PATE-TabTransGAN: Differentially Private Synthetic Tabular Data Generation via Transformer-Based Student Discrimination

Generating high-fidelity synthetic tabular data under formal differential privacy guarantees remains an open challenge. Methods that provide strong theoretical protection typically sacrifice the modeling of inter-feature dependencies required for realistic synthesis, while architectures that excel at capturing complex column relationships offer only empirical privacy guarantees. We present PATE-TabTransGAN, a generative framework that integrates the Private Aggregation of Teacher Ensembles (PATE) mechanism with a Transformer-based student discriminator to address both requirements jointly, and employs a GNMax RDP accountant for numerically stable privacy accounting. An ensemble of Logistic Regression teachers trained on disjoint partitions supervises the student via noisy-aggregated labels, and a residual generator is optimized against this differentially private student, inheriting formal (ε, δ)-DP guarantees by post-processing. PATE-TabTransGAN was compared with PATE-GAN, DP-GAN, and DP-CTGAN, which are considered state-of-the-art in differentially private tabular synthesis. In experiments on four tabular benchmarks (Adult, Breast, Cardio, Cervical), PATE-TabTransGAN attains the best or tied-best AUROC on all four datasets. On AUCPR it matches the strongest baseline on Cardio, leads on Cervical, and trails on Breast; on Adult, we demonstrate that AUCPR is highly sensitive to the positive-class convention, and that the observed gap is consistent with a convention difference between evaluation pipelines rather than a synthesis deficit.
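The abstract's remark about the positive-class convention can be illustrated with a toy calculation. The sketch below is not the paper's evaluation pipeline: the labels and scores are invented for illustration, and `auroc` and `average_precision` are hypothetical helper names written in plain Python. It shows that swapping which class counts as positive leaves AUROC unchanged while AUCPR (computed here as average precision) shifts considerably on imbalanced data.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes distinct scores)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / sum(labels)


# Invented toy data: 2 positives out of 10 (imbalanced, like a minority class).
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

# Flip the convention: the former negative class becomes "positive",
# and the score ordering is reversed to match.
flipped_labels = [1 - y for y in labels]
flipped_scores = [-s for s in scores]

print("AUROC:", auroc(labels, scores), auroc(flipped_labels, flipped_scores))
print("AUCPR:", average_precision(labels, scores),
      average_precision(flipped_labels, flipped_scores))
```

On this toy data AUROC is 15/16 under both conventions, whereas average precision moves from 5/6 to 71/72, so two pipelines that disagree only on the positive class can report very different AUCPR values for the same classifier.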

Top-level tags: machine learning, data
Detailed tags: differential privacy, synthetic data, tabular data, transformer, generative adversarial network

PATE-TabTransGAN: Differentially Private Synthetic Tabular Data Generation via Transformer-Based Student Discrimination


1️⃣ One-sentence summary

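This paper proposes PATE-TabTransGAN, a framework that combines the Private Aggregation of Teacher Ensembles (PATE) mechanism with a Transformer-based student discriminator to generate high-quality synthetic tabular data that preserves the complex feature relationships of the real data under formal differential privacy guarantees; in experiments on four benchmarks it attains the best or tied-best AUROC against existing differentially private tabular data generators (PATE-GAN, DP-GAN, DP-CTGAN).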
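The PATE step mentioned above, in which teachers label a sample for the student through a noisy vote, can be sketched as follows. This is a minimal illustration of Gaussian noisy-max (GNMax) aggregation, not the paper's implementation: the function name `gnmax_label`, the teacher count, the vote split, and the noise scale `sigma` are all assumptions chosen for the example, and the privacy accounting is omitted.

```python
import random


def gnmax_label(teacher_votes, num_classes, sigma, rng):
    """Return the class with the highest vote count after adding
    independent N(0, sigma^2) noise to every per-class count."""
    counts = [0] * num_classes
    for vote in teacher_votes:
        counts[vote] += 1
    noisy = [c + rng.gauss(0.0, sigma) for c in counts]
    return max(range(num_classes), key=lambda k: noisy[k])


# Hypothetical setup: 50 teachers, each trained on a disjoint data partition,
# vote on whether one generated sample looks real (1) or fake (0).
votes = [1] * 40 + [0] * 10

rng = random.Random(0)  # seeded only to make the sketch reproducible
student_label = gnmax_label(votes, num_classes=2, sigma=5.0, rng=rng)
print("noisy label passed to the student:", student_label)
```

The student only ever sees these noisy labels, never the teachers' training partitions, which is why anything trained against the student (here, the generator) inherits the privacy guarantee by post-processing. A larger `sigma` strengthens privacy per query at the cost of more label flips when the teacher vote is close.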

Source: arXiv:2605.26802