arXiv submission date: 2026-03-16
📄 Abstract - Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty

Calibration is a conditional property that depends on the information retained by a predictor. We develop decomposition identities for arbitrary proper losses that make this dependence explicit. At any information level $\mathcal A$, the expected loss of an $\mathcal A$-measurable predictor splits into a proper-regret (reliability) term and a conditional entropy (residual uncertainty) term. For nested levels $\mathcal A\subseteq\mathcal B$, a chain decomposition quantifies the information gain from $\mathcal A$ to $\mathcal B$. Applied to classification with features $\boldsymbol{X}$ and score $S=s(\boldsymbol{X})$, this yields a three-term identity: miscalibration, a {\em grouping} term measuring information loss from $\boldsymbol{X}$ to $S$, and irreducible uncertainty at the feature level. We leverage the framework to analyze post-hoc recalibration, aggregation of calibrated models, and stagewise/boosting constructions, with explicit forms for Brier and log-loss.
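For the Brier score, the three-term identity in the abstract takes a fully explicit form: with $\eta(\boldsymbol X)=\mathbb E[Y\mid\boldsymbol X]$ and $c(S)=\mathbb E[Y\mid S]$, the expected loss $\mathbb E[(S-Y)^2]$ splits into miscalibration $\mathbb E[(S-c(S))^2]$, a grouping term $\mathbb E[(c(S)-\eta(\boldsymbol X))^2]$, and irreducible uncertainty $\mathbb E[\eta(1-\eta)]$. The sketch below (not the paper's code; the distribution, score values, and variable names are illustrative assumptions) verifies this identity exactly on a small discrete example:

```python
import numpy as np

# Verify the three-term Brier identity
#   E[(S-Y)^2] = miscalibration + grouping + irreducible
# on a discrete toy example with 10 feature states.

p = np.full(10, 0.1)                       # uniform law of X over 10 states
eta = np.linspace(0.05, 0.95, 10)          # eta(x) = E[Y | X = x]
# A coarse, slightly miscalibrated score S = s(X): two groups of 5 states
s = np.where(np.arange(10) < 5, 0.20, 0.75)

# c(v) = E[Y | S = v]: probability-weighted mean of eta within each group
c = np.empty_like(eta)
for v in np.unique(s):
    mask = (s == v)
    c[mask] = np.sum(p[mask] * eta[mask]) / np.sum(p[mask])

total       = np.sum(p * ((s - eta) ** 2 + eta * (1 - eta)))  # E[(S-Y)^2]
miscal      = np.sum(p * (s - c) ** 2)      # reliability / proper regret
grouping    = np.sum(p * (c - eta) ** 2)    # information lost from X to S
irreducible = np.sum(p * eta * (1 - eta))   # Var(Y | X) at the feature level

# The cross term vanishes because c is the conditional mean given S,
# so the identity holds to machine precision:
assert abs(total - (miscal + grouping + irreducible)) < 1e-12
print(miscal, grouping, irreducible, total)
```

Note that the grouping term is zero exactly when $S$ retains all of the predictive information in $\boldsymbol X$ (i.e. $c(S)=\eta(\boldsymbol X)$ almost surely), which is the information-loss reading emphasized in the abstract.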

Top-level tags: theory, model evaluation, machine learning
Detailed tags: calibration, proper losses, information decomposition, probabilistic forecasting, uncertainty quantification

Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty


1️⃣ One-sentence summary

This paper develops a general framework that decomposes the prediction error of any probabilistic forecasting model into three parts: lack of reliability (miscalibration), information loss, and irreducible uncertainty. The decomposition makes explicit how predictive performance depends on the amount of information the predictor uses, and is applied to the analysis of model recalibration, aggregation of calibrated models, and related settings.

Source: arXiv 2603.15232