arXiv submission date: 2026-04-13
📄 Abstract - Beyond Fixed False Discovery Rates: Post-Hoc Conformal Selection with E-Variables

Conformal selection (CS) uses calibration data to identify test inputs whose unobserved outcomes are likely to satisfy a pre-specified minimal quality requirement, while controlling the false discovery rate (FDR). Existing methods fix the target FDR level before observing data, which prevents the user from adapting the balance between the number of selected test inputs and the FDR to downstream needs and constraints based on the available data. For example, in genomics or neuroimaging, researchers often inspect the distribution of test statistics and decide how aggressively to pursue candidates based on observed evidence strength and available follow-up resources. To address this limitation, we introduce post-hoc CS (PH-CS), which generates a path of candidate selection sets, each paired with a data-driven false discovery proportion (FDP) estimate. PH-CS lets the user select any operating point on this path by maximizing a user-specified utility, arbitrarily balancing selection size and FDR. Building on conformal e-variables and the e-Benjamini-Hochberg (e-BH) procedure, PH-CS is proved to provide a finite-sample post-hoc reliability guarantee whereby the ratio between estimated FDP level and true FDP is, on average, upper bounded by $1$, so that the average estimated FDP is, to first order, a valid upper bound on the true FDR. PH-CS is extended to control quality defined in terms of a general risk. Experiments on synthetic and real-world datasets demonstrate that, unlike CS, PH-CS can consistently satisfy user-imposed utility constraints while producing reliable FDP estimates and maintaining competitive FDR control.
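
To make the idea of a selection path concrete, here is a minimal Python sketch of the standard e-BH rule together with one possible way to attach a data-driven level to every candidate set size. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the formula for the FDP estimate, so the sketch uses `n / (k * e_(k))`, the smallest level at which e-BH would return `k` selections. The function names, the utility function, and the e-values are all hypothetical, and the construction of conformal e-variables from calibration data is not shown.

```python
from typing import Callable, List, Tuple


def ebh_select(e_values: List[float], alpha: float) -> List[int]:
    """Standard e-BH at a fixed level alpha.

    Selects the k* largest e-values, where
    k* = max{k : e_(k) >= n / (alpha * k)} and e_(k) is the k-th largest.
    """
    n = len(e_values)
    order = sorted(range(n), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k in range(1, n + 1):
        if e_values[order[k - 1]] >= n / (alpha * k):
            k_star = k
    return sorted(order[:k_star])


def selection_path(e_values: List[float]) -> List[Tuple[int, List[int], float]]:
    """Return (k, top-k indices, level_k) for every set size k.

    ASSUMPTION: level_k = min(1, n / (k * e_(k))) is used as the per-size
    estimate; the paper's actual FDP estimate may differ.
    """
    n = len(e_values)
    order = sorted(range(n), key=lambda i: e_values[i], reverse=True)
    path = []
    for k in range(1, n + 1):
        e_k = e_values[order[k - 1]]
        level = 1.0 if e_k <= 0 else min(1.0, n / (k * e_k))
        path.append((k, sorted(order[:k]), level))
    return path


def choose_operating_point(
    path: List[Tuple[int, List[int], float]],
    utility: Callable[[int, float], float],
) -> Tuple[int, List[int], float]:
    """Pick the path entry that maximizes a user-supplied utility(k, level)."""
    return max(path, key=lambda entry: utility(entry[0], entry[2]))


# Made-up e-values, purely for illustration.
e_values = [20.0, 1.0, 8.0, 0.5, 10.0]
path = selection_path(e_values)

# Hypothetical utility: as many selections as possible, subject to the
# estimated level not exceeding 0.25.
best = choose_operating_point(path, lambda k, level: k if level <= 0.25 else -1)
```

With these e-values, `best` is the three-element set `[0, 2, 4]`, which matches what `ebh_select(e_values, 0.25)` would return at that fixed level. The difference in the post-hoc setting is that the level is read off the path after seeing the data rather than being fixed in advance.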

Top-level tags: theory, model evaluation, machine learning
Detailed tags: conformal inference, false discovery rate, post-hoc analysis, statistical guarantees, e-variables

Beyond Fixed False Discovery Rates: Post-Hoc Conformal Selection with E-Variables


1️⃣ One-sentence summary

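This paper proposes a new method called "post-hoc conformal selection", which lets researchers, after analyzing the data, flexibly trade off the number of discoveries against the error rate according to their actual needs, rather than having to fix an error-rate control target in advance as traditional methods require.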

Source: arXiv: 2604.11305