arXiv submission date: 2026-07-02
📄 Abstract - Rethinking Post-Hoc Calibration in Semantic Segmentation

Reliable confidence estimates are essential in semantic segmentation, especially in safety-critical settings where overconfident errors can mislead downstream decisions. Yet modern segmentation models often remain miscalibrated. Post-hoc calibration offers a practical way to correct confidence estimates without retraining the segmentation model, but its use in dense prediction raises structural issues that are often overlooked. We study two such issues. First, adding a constant to all logits leaves the softmax probabilities unchanged, but several standard calibrators can still depend on this arbitrary offset. As a result, two logit representations encoding the same predictive distribution may yield different calibrated probabilities. We define translation-invariant (TI) calibrators as those whose outputs are unchanged under such shifts, characterize which common calibrators satisfy this property, and construct TI counterparts of shift-sensitive calibrators to isolate the effect of removing representation dependence. Second, post-hoc calibration is typically fitted by minimizing a likelihood-based objective, whereas segmentation models are trained with task-specific metrics such as Dice. This mismatch can cause calibration to alter class orderings and degrade the deployed segmentation map. We study decision-preserving calibration under argmax- and order-preservation constraints. Since enforcing these constraints collapses affine softmax calibrators to temperature scaling, we introduce class-conditional affine calibrators that can be made argmax- or order-preserving while retaining greater expressivity, allowing us to quantify the calibration-segmentation trade-off induced by decision preservation. 
Across natural-image and medical segmentation benchmarks, and under corruption-based covariate shift, matched comparisons show that TI variants generally improve calibration metrics, while decision-preserving variants prevent segmentation degradation and retain strong calibration performance. These results provide practical design principles for well-defined post-hoc calibration pipelines in semantic segmentation.
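
The two structural issues named in the abstract can be seen on a single pixel's logit vector. What follows is a minimal pure-Python sketch, not the authors' implementation. The function names, the example logits, and the parameters `w`, `b`, and `T` are all hypothetical. Mean-centering the logits is used here as one simple way to obtain a shift-insensitive variant of vector scaling; it is an assumption for illustration, and the paper's TI construction may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's per-class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_scaling(logits, T):
    """softmax(z / T): a single positive scalar shared by all classes."""
    return softmax([z / T for z in logits])


def vector_scaling(logits, w, b):
    """softmax(w * z + b) with one weight and one bias per class."""
    return softmax([wk * z + bk for wk, z, bk in zip(w, logits, b)])


def centered_vector_scaling(logits, w, b):
    """Assumption: subtract the mean logit first, so a constant shift cancels."""
    mean = sum(logits) / len(logits)
    return vector_scaling([z - mean for z in logits], w, b)


def argmax(probs):
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


def max_abs_diff(p, q):
    """Largest per-class difference between two distributions."""
    return max(abs(a - b) for a, b in zip(p, q))


# Hypothetical logits for one pixel, and the same logits shifted by a constant.
logits = [2.0, 1.5, -0.5]
shifted = [z + 10.0 for z in logits]

# Hypothetical calibrator parameters (not fitted to any data).
w = [0.5, 1.5, 1.0]
b = [0.0, 0.0, 0.0]
T = 2.0

# Both logit vectors encode the same predictive distribution.
same_distribution = max_abs_diff(softmax(logits), softmax(shifted)) < 1e-12

# Issue 1: dependence on the arbitrary offset.
ts_gap = max_abs_diff(temperature_scaling(logits, T), temperature_scaling(shifted, T))
vs_gap = max_abs_diff(vector_scaling(logits, w, b), vector_scaling(shifted, w, b))
centered_gap = max_abs_diff(
    centered_vector_scaling(logits, w, b), centered_vector_scaling(shifted, w, b)
)

# Issue 2: whether the predicted class survives calibration.
pred_before = argmax(softmax(logits))
pred_after_ts = argmax(temperature_scaling(logits, T))
pred_after_vs = argmax(vector_scaling(logits, w, b))

print("same input distribution:", same_distribution)
print("shift gap (temperature, vector, centered vector):", ts_gap, vs_gap, centered_gap)
print("predicted class (raw, temperature, vector):", pred_before, pred_after_ts, pred_after_vs)
```

In this sketch, temperature scaling is unaffected by the shift because dividing by `T` turns one constant offset into another. Vector scaling multiplies the offset by a different weight per class, so its output changes. The same unequal weights can also move the argmax, which is the kind of segmentation change that decision-preserving calibrators are meant to rule out. Centering removes the shift dependence but does not by itself preserve the argmax.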

Top-level tags: computer vision, model training, model evaluation
Detailed tags: semantic segmentation, calibration, confidence estimation, translation invariance, decision preservation

Rethinking Post-Hoc Calibration in Semantic Segmentation


1️⃣ One-sentence summary

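This paper identifies two key problems with post-hoc calibration in semantic segmentation: the calibrated probabilities can depend on an arbitrary constant offset in the model's logits, and calibration can alter the original segmentation result (for example, by changing class orderings). It proposes two remedies, translation-invariant calibrators and decision-preserving calibrators, which maintain calibration quality while avoiding harm to segmentation performance.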

Source: arXiv: 2607.01902