arXiv submission date: 2026-04-14
📄 Abstract - GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees

Adversarial robustness is essential for deploying neural networks in safety-critical applications, yet standard evaluation methods either require expensive adversarial attacks or report only a single aggregate score that obscures how robustness is distributed across classes. We introduce the GF-Score (GREAT-Fairness Score), a framework that decomposes the certified GREAT Score into per-class robustness profiles and quantifies their disparity through four metrics grounded in welfare economics: the Robustness Disparity Index (RDI), the Normalized Robustness Gini Coefficient (NRGC), Worst-Case Class Robustness (WCR), and a Fairness-Penalized GREAT Score (FP-GREAT). The framework further eliminates the original method's dependence on adversarial attacks through a self-calibration procedure that tunes the temperature parameter using only clean accuracy correlations. Evaluating 22 models from RobustBench across CIFAR-10 and ImageNet, we find that the decomposition is exact, that per-class scores reveal consistent vulnerability patterns (e.g., "cat" is the weakest class in 76% of CIFAR-10 models), and that more robust models tend to exhibit greater class-level disparity. These results establish a practical, attack-free auditing pipeline for diagnosing where certified robustness guarantees fail to protect all classes equally. We release our code on GitHub (this https URL).
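The abstract names two of the disparity metrics concretely: Worst-Case Class Robustness (the score of the weakest class) and a Gini-style coefficient over per-class scores. The paper's exact normalizations are not given in this summary, so the sketch below is illustrative: it computes the standard Gini coefficient as a proxy for the NRGC and the minimum per-class score as the WCR, over a hypothetical vector of per-class certified scores.

```python
import numpy as np

def gini(scores):
    """Standard Gini coefficient of non-negative per-class scores.
    0 = all classes equally robust; larger values = more disparity.
    (Proxy only -- the paper's NRGC normalization may differ.)"""
    s = np.sort(np.asarray(scores, dtype=float))
    n = s.size
    # G = (2 * sum_i i*x_(i)) / (n * sum_i x_i) - (n + 1) / n
    return 2 * np.sum(np.arange(1, n + 1) * s) / (n * s.sum()) - (n + 1) / n

# Hypothetical per-class certified scores for a 10-class model (CIFAR-10-like)
per_class = np.array([0.42, 0.55, 0.31, 0.18, 0.47, 0.50, 0.44, 0.39, 0.53, 0.49])

wcr = per_class.min()          # Worst-Case Class Robustness: the weakest class
disparity = gini(per_class)    # disparity proxy for the NRGC
aggregate = per_class.mean()   # the single aggregate score that hides `wcr`
```

The point of the framework is visible even in this toy example: the aggregate mean looks healthy while `wcr` exposes a class (index 3, the "cat"-like case) left far less protected.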

Top-level tags: model evaluation, machine learning theory
Detailed tags: adversarial robustness, fairness, certified evaluation, class disparity, welfare economics

GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees


1️⃣ One-Sentence Summary

This paper proposes a new framework, GF-Score, that evaluates how a neural network's robustness differs across classes without relying on adversarial attacks, and uses quantitative metrics to reveal whether the model protects all classes fairly.

Source: arXiv:2604.12757