rSDNet: Unified Robust Neural Learning against Label Noise and Adversarial Attacks
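1️⃣ One-sentence summary
This paper proposes rSDNet, a unified robust learning method that uses a statistical estimation framework based on S-divergences so that a neural network can withstand both mislabeled training data and adversarial input attacks during training, improving reliability on contaminated data while keeping accuracy on clean data competitive.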
Neural networks are central to modern artificial intelligence, yet their training remains highly sensitive to data contamination. Standard neural classifiers are trained by minimizing the categorical cross-entropy loss, which corresponds to maximum likelihood estimation under a multinomial model. While statistically efficient under ideal conditions, this approach is highly vulnerable to contaminated observations, including label noise that corrupts supervision in the output space and adversarial perturbations that induce worst-case deviations in the input space. In this paper, we propose a unified and statistically grounded framework for robust neural classification that addresses both forms of contamination within a single learning objective. We formulate neural network training as a minimum-divergence estimation problem and introduce rSDNet, a robust learning algorithm based on the general class of $S$-divergences. The resulting training objective inherits robustness properties from classical statistical estimation, automatically down-weighting aberrant observations through model probabilities. We establish essential population-level properties of rSDNet, including Fisher consistency, classification calibration implying Bayes optimality, and robustness guarantees under uniform label noise and infinitesimal feature contamination. Experiments on three benchmark image classification datasets show that rSDNet improves robustness to label corruption and adversarial attacks while maintaining competitive accuracy on clean data. Our results highlight minimum-divergence learning as a principled and effective framework for robust neural classification under heterogeneous data contamination.
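The abstract does not state the rSDNet objective itself, so the following is only a minimal sketch of the general idea of down-weighting through model probabilities. It assumes the density power divergence (DPD), a well-known member of the S-divergence family obtained when the second tuning parameter is zero, applied to a categorical model. The function names (`softmax`, `cross_entropy`, `dpd_loss`, `score_weight`) and the tuning parameter `alpha` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Standard categorical cross-entropy for one observation."""
    return -math.log(probs[label])


def dpd_loss(probs, label, alpha):
    """Density power divergence loss for one observation (assumed form).

    This is the DPD special case of the S-divergence family, NOT the
    rSDNet loss from the paper. The constant 1/alpha is added so that
    the loss tends to the cross-entropy as alpha tends to 0.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    model_term = sum(p ** (1 + alpha) for p in probs)
    data_term = (1 + 1 / alpha) * probs[label] ** alpha
    return model_term - data_term + 1 / alpha


def score_weight(probs, label, alpha):
    """Factor p_y**alpha that multiplies the log-likelihood score in the
    gradient of the DPD data term; small when the model finds the
    observed label implausible."""
    return probs[label] ** alpha


if __name__ == "__main__":
    probs = softmax([2.0, 0.0, -1.0])
    for alpha in (0.1, 0.5, 1.0):
        print(alpha, dpd_loss(probs, 0, alpha), score_weight(probs, 2, alpha))
    print("cross-entropy:", cross_entropy(probs, 0))
```

In this sketch a sample whose observed label has low model probability contributes to the gradient with the small factor `p_y**alpha`, and the loss stays bounded by `1 + 1/alpha`, whereas the cross-entropy grows without bound as `p_y` approaches zero. Both behaviors illustrate why minimum-divergence objectives of this kind are less sensitive to mislabeled observations.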
Source: arXiv: 2603.17628