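Missingness Bias Calibration in Feature Attribution Explanations
1️⃣ One-Sentence Summary
This paper proposes MCal, a lightweight post-hoc method that calibrates the missingness bias pervasive in model explanations by fine-tuning a simple linear head. Without retraining the model or modifying its architecture, it improves the reliability of feature importance scores across a range of medical tasks.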
Popular explanation methods often produce unreliable feature importance scores due to missingness bias, a systematic distortion that arises when models are probed with ablated, out-of-distribution inputs. Existing solutions treat this as a deep representational flaw that requires expensive retraining or architectural modifications. In this work, we challenge this assumption and show that missingness bias can be effectively treated as a superficial artifact of the model's output space. We introduce MCal, a lightweight post-hoc method that corrects this bias by fine-tuning a simple linear head on the outputs of a frozen base model. Surprisingly, we find this simple correction consistently reduces missingness bias and is competitive with, or even outperforms, prior heavyweight approaches across diverse medical benchmarks spanning vision, language, and tabular domains.
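The idea of fitting a linear head on a frozen model's outputs can be illustrated with a small, dependency-free sketch. This is not the paper's implementation: the toy `frozen_model`, the zero-valued ablation baseline, the use of the clean-input class as the training target, and the cross-entropy objective are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def frozen_model(x):
    # Hypothetical frozen base model (its weights are never updated).
    # Toy missingness bias: every zeroed feature pushes the class-1 logit up.
    w = [[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]]
    z = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    z[1] += 0.75 * sum(1 for xi in x if xi == 0.0)
    return z

def apply_head(W, b, z):
    # Linear head applied to the base model's output logits: W z + b.
    return [sum(wij * zj for wij, zj in zip(row, z)) + bi for row, bi in zip(W, b)]

def mean_cross_entropy(W, b, logits, labels):
    total = 0.0
    for z, y in zip(logits, labels):
        total -= math.log(softmax(apply_head(W, b, z))[y])
    return total / len(logits)

def fit_linear_head(logits, labels, lr=0.05, steps=300):
    # Full-batch gradient descent on cross-entropy, starting from the identity
    # head so that the uncalibrated model is the initial point.
    k, n = len(logits[0]), len(logits)
    W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for _ in range(steps):
        gW = [[0.0] * k for _ in range(k)]
        gb = [0.0] * k
        for z, y in zip(logits, labels):
            p = softmax(apply_head(W, b, z))
            for i in range(k):
                err = p[i] - (1.0 if i == y else 0.0)
                gb[i] += err / n
                for j in range(k):
                    gW[i][j] += err * z[j] / n
        for i in range(k):
            b[i] -= lr * gb[i]
            for j in range(k):
                W[i][j] -= lr * gW[i][j]
    return W, b

# Build toy training data: logits on randomly ablated inputs, paired with the
# class the frozen model assigns to the clean input (an assumed target).
rng = random.Random(0)
logits, labels = [], []
for _ in range(200):
    x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    clean = frozen_model(x)
    labels.append(0 if clean[0] > clean[1] else 1)
    ablated = [xi if rng.random() < 0.5 else 0.0 for xi in x]
    logits.append(frozen_model(ablated))

identity = [[1.0, 0.0], [0.0, 1.0]]
before = mean_cross_entropy(identity, [0.0, 0.0], logits, labels)
W, b = fit_linear_head(logits, labels)
after = mean_cross_entropy(W, b, logits, labels)
print(f"cross-entropy on ablated inputs: before={before:.3f}, after={after:.3f}")
```

Only the small matrix `W` and the vector `b` are trained; the base model is queried but never modified, which is what makes a correction of this kind cheap compared with retraining.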
Source: arXiv: 2603.04831