arXiv submission date: 2026-05-27
📄 Abstract - Semi-Supervised Hypothesis Testing by Betting on Predictions

We introduce a testing-by-betting framework that leverages predictions on unlabeled data to enhance the power of sequential hypothesis testing. Given limited samples from the joint distribution of $(X,Y)$, and additional unlabeled samples from the marginal of $X$, we ask how unlabeled data can be used to test hypotheses about the distribution of $Y$ and the conditional distribution of $Y\mid X$. We introduce an e-statistic and use it to construct a sequential test. Under standard distributional assumptions -- label shift or concept shift -- we establish that the test is anytime valid. Furthermore, we show that for binary data, the e-statistic has non-trivial power. Crucially, our approach retains these properties even when the underlying predictions are inaccurate. Through simulations and applications to large language model evaluation, we demonstrate power gains over baseline approaches, including prediction-powered inference. These gains persist even with relatively limited unlabeled data and when predictions have low accuracy due to weak correlation between $X$ and $Y$.
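
To make the "anytime valid" claim concrete, here is a minimal sketch of the generic testing-by-betting construction for binary data. It is not the paper's e-statistic: it uses only labeled outcomes and ignores unlabeled data and predictions entirely. It assumes a point null $E[Y] = p_0$ and a fixed bet size; the names `betting_test`, `p0`, `lam`, and `alpha` are hypothetical and chosen for illustration. Under the null each multiplicative factor has expectation 1, so the wealth is a nonnegative martingale starting at 1, and Ville's inequality bounds the probability that it ever reaches $1/\alpha$ by $\alpha$.

```python
def betting_test(ys, p0, lam, alpha=0.05):
    """Sequential test of H0: E[Y] = p0 for binary outcomes Y in {0, 1}.

    Illustrative sketch only (assumed setup, not the paper's method).
    The wealth starts at 1 and is multiplied by 1 + lam * (y - p0)
    after each observation. H0 is rejected the first time the wealth
    reaches 1 / alpha.

    Returns (rejected, number_of_observations_used, final_wealth).
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    # The bet must keep every factor strictly positive for y in {0, 1}.
    if not (-1.0 / (1.0 - p0) < lam < 1.0 / p0):
        raise ValueError("lam must lie in (-1/(1-p0), 1/p0)")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    wealth = 1.0
    steps = 0
    for y in ys:
        steps += 1
        wealth *= 1.0 + lam * (y - p0)
        if wealth >= 1.0 / alpha:
            # Stopping here is valid at any time by Ville's inequality.
            return True, steps, wealth
    return False, steps, wealth
```

For example, `betting_test([1] * 20, p0=0.5, lam=1.0)` multiplies the wealth by 1.5 at every step and stops as soon as it crosses 20, whereas a stream that alternates between 1 and 0 shrinks the wealth over time and never rejects. A fixed `lam` is the simplest choice; adaptive bets are common in practice but are left out of this sketch.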

Top-level tags: machine learning theory model evaluation
Detailed tags: sequential hypothesis testing e-statistic semi-supervised label shift prediction-powered inference

Semi-Supervised Hypothesis Testing by Betting on Predictions


1️⃣ One-sentence summary

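This paper proposes a new statistical testing method that uses predictions on large amounts of unlabeled data to improve the efficiency of hypothesis testing. The results remain reliable even when the predictions are inaccurate, and the method outperforms traditional approaches on practical tasks such as language model evaluation.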

Source: arXiv: 2605.28533