arXiv submission date: 2026-06-28
📄 Abstract - Self-Organized Conformal Prediction: Reducing Regional Coverage Gaps with Unsupervised Group Discovery

Conformal prediction guarantees marginal coverage, but pooled calibration averages over heterogeneous regions and can mask regional undercoverage in safety-critical subgroups. We introduce Self-Organized Conformal Prediction (SOCP), a calibration scheme that discovers input-space groups with a Self-Organizing Map (SOM) and, at test time, draws a local calibration buffer from the query's best-matching unit (BMU) cell or a fixed grid neighborhood. The same retrieval rule applies to regression and classification tasks across tabular features and image embeddings, leaving the predictor and nonconformity score untouched. SOCP gives exact validity for BMU-cell retrieval and fixed retrieved-set validity for neighborhood buffers; central-cell validity for neighborhood retrieval holds up to a Kolmogorov-Smirnov (KS) bias term. A split-routed extension recovers fixed retrieved-set validity conditional on the routing split. On eight regression and classification benchmarks, SO-SCP reduces the weighted regional coverage gap on $7/8$ datasets (mean paired change $-7.1\%$) for a mean prediction-set size increase of $6.2\%$, with negligible overhead on the largest six datasets; SO-CQR yields smaller gains, since quantile regression already absorbs much of the heterogeneity. By learning groups directly from the input geometry, SOCP provides group-local calibration with exact fixed-group guarantees and approximate central-cell guarantees, without supervised partitions or predictor retraining.
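The BMU-cell retrieval rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train_som`, `bmu`, `conformal_quantile`, `local_threshold`), the linear decay schedule, and the choice to return an infinite threshold when a cell holds too few calibration points are all assumptions made for illustration. Nonconformity scores are taken as given, and neighborhood retrieval and the split-routed extension are not shown.

```python
import math
import random


def bmu(x, codebook):
    """Index of the best-matching unit: the prototype closest to x."""
    return min(range(len(codebook)), key=lambda k: math.dist(x, codebook[k]))


def train_som(X, rows, cols, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Tiny online SOM on a rows x cols grid (illustrative, not tuned)."""
    rng = random.Random(seed)
    # Initialise prototypes from randomly chosen training points.
    codebook = [list(rng.choice(X)) for _ in range(rows * cols)]
    grid = [(k // cols, k % cols) for k in range(rows * cols)]
    total = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(X, len(X)):
            frac = 1.0 - step / total  # decays linearly from 1 toward 0
            lr = lr0 * frac
            sigma = max(sigma0 * frac, 1e-3)
            win = grid[bmu(x, codebook)]
            for k, w in enumerate(codebook):
                # Squared grid distance between unit k and the winning unit.
                d2 = (grid[k][0] - win[0]) ** 2 + (grid[k][1] - win[1]) ** 2
                h = lr * math.exp(-d2 / (2 * sigma ** 2))
                for j in range(len(w)):
                    w[j] += h * (x[j] - w[j])
            step += 1
    return codebook


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if n == 0 or rank > n:
        # Assumption: with too few scores, fall back to the trivial set.
        return math.inf
    return sorted(scores)[rank - 1]


def local_threshold(x, codebook, cal_X, cal_scores, alpha):
    """Threshold from the calibration scores that share the query's BMU cell."""
    cell = bmu(x, codebook)
    buffer = [s for xc, s in zip(cal_X, cal_scores) if bmu(xc, codebook) == cell]
    return conformal_quantile(buffer, alpha)
```

In this sketch the predictor and the nonconformity score stay untouched, as in the abstract: only the set of calibration scores used for the quantile changes with the query. A caller would fit `train_som` on input features once, then call `local_threshold` per test point and build the prediction set from the returned threshold in the usual split-conformal way.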

Top-level tags: machine learning, model training, model evaluation
Detailed tags: conformal prediction, calibration, uncertainty quantification, self-organizing map, regional coverage

Self-Organized Conformal Prediction: Reducing Regional Coverage Gaps with Unsupervised Group Discovery


1️⃣ One-sentence summary

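This paper proposes Self-Organized Conformal Prediction (SOCP), which uses a Self-Organizing Map (SOM) to partition the input space into regions automatically and calibrates prediction sets locally within each region, reducing the undercoverage that pooled calibration leaves in critical subgroups and lowering regional coverage gaps across regression and classification tasks at the cost of only a small increase in prediction-set size.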

Source: arXiv: 2606.29403