arXiv submission date: 2026-03-06
📄 Abstract - Mitigating Bias in Concept Bottleneck Models for Fair and Interpretable Image Classification

Ensuring fairness in image classification prevents models from perpetuating and amplifying bias. Concept bottleneck models (CBMs) map images to high-level, human-interpretable concepts before making predictions via a sparse, one-layer classifier. This structure enhances interpretability and, in theory, supports fairness by masking sensitive attribute proxies such as facial features. However, CBM concepts have been known to leak information unrelated to concept semantics, and early results reveal only marginal reductions in gender bias on datasets like ImSitu. We propose three bias mitigation techniques to improve fairness in CBMs: (1) decreasing information leakage using a top-k concept filter, (2) removing biased concepts, and (3) adversarial debiasing. Our results outperform prior work in terms of fairness-performance tradeoffs, indicating that our debiased CBM provides a significant step towards fair and interpretable image classification.
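The first technique above, a top-k concept filter, can be sketched as keeping only the k highest-activation concepts per sample before the sparse linear classifier, so that low-relevance concept activations cannot leak extraneous information. This is an illustrative NumPy sketch under that assumption; the function name and the exact filtering rule used in the paper are hypothetical here.

```python
import numpy as np

def topk_concept_filter(concept_scores, k):
    """Zero out all but the k highest-activation concepts per sample.

    Illustrative sketch of a top-k concept filter: the paper's exact
    rule (e.g. thresholding or soft masking) may differ.
    concept_scores: array of shape (n_samples, n_concepts)
    """
    scores = np.asarray(concept_scores, dtype=float)
    filtered = np.zeros_like(scores)
    # Column indices of the k largest activations in each row.
    topk_idx = np.argsort(scores, axis=1)[:, -k:]
    rows = np.arange(scores.shape[0])[:, None]
    # Copy only the top-k activations; all other concepts stay at zero.
    filtered[rows, topk_idx] = scores[rows, topk_idx]
    return filtered
```

For example, filtering a single sample's concept vector `[0.1, 0.9, 0.5, 0.2]` with `k=2` keeps only the 0.9 and 0.5 activations and zeroes the rest, so the downstream classifier only ever sees the most salient concepts.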

Top-level tags: computer vision, model evaluation, machine learning
Detailed tags: fairness, bias mitigation, interpretability, concept bottleneck models, image classification

Mitigating Bias in Concept Bottleneck Models for Fair and Interpretable Image Classification


1️⃣ One-sentence summary

This work proposes three new techniques to reduce information leakage and bias in concept bottleneck models, substantially improving the fairness and interpretability of image classification while preserving model performance.

Source: arXiv 2603.05899