arXiv submission date: 2026-03-28
📄 Abstract - Multiple-Prediction-Powered Inference

Statistical estimation often involves tradeoffs between expensive, high-quality measurements and a variety of lower-quality proxies. We introduce Multiple-Prediction-Powered Inference (MultiPPI): a general framework for constructing statistically efficient estimates by optimally allocating resources across these diverse data sources. This work provides theoretical guarantees about the minimax optimality, finite-sample performance, and asymptotic normality of the MultiPPI estimator. Through experiments across three diverse large language model (LLM) evaluation scenarios, we show that MultiPPI consistently achieves lower estimation error than existing baselines. This advantage stems from its budget-adaptive allocation strategy, which strategically combines subsets of models by learning their complex cost and correlation structures.

Top-level tags: model evaluation, machine learning theory
Detailed tags: statistical inference, budget allocation, LLM evaluation, minimax optimality, data sources

Multiple-Prediction-Powered Inference


1️⃣ One-Sentence Summary

This paper proposes a new framework called MultiPPI that intelligently combines expensive-but-accurate data with cheap-but-noisy predictive models to produce more precise statistical estimates under a fixed budget, and demonstrates its advantages in practical settings such as large language model evaluation.
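To make the core idea concrete, below is a minimal sketch of the basic prediction-powered inference (PPI) recipe that MultiPPI generalizes: estimate a mean from many cheap proxy predictions, then correct the proxy's bias using a small set of expensive ground-truth labels. The data, the proxy's bias, and all sample sizes here are simulated assumptions for illustration, not taken from the paper, and this sketch uses a single proxy rather than MultiPPI's multi-source budget allocation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a small "gold" set with expensive ground-truth labels,
# and a large unlabeled set scored only by a cheap proxy model.
n_gold, n_proxy = 100, 10_000
y_gold = rng.normal(1.0, 1.0, n_gold)            # true labels; true mean is 1.0
f_gold = y_gold + rng.normal(0.2, 0.5, n_gold)   # proxy predictions on the gold set (biased upward)
f_proxy = rng.normal(1.2, 1.1, n_proxy)          # proxy predictions on the large unlabeled set

# Classical estimate: uses only the expensive labels (high variance).
theta_classical = y_gold.mean()

# PPI-style estimate: proxy mean over the large set, plus a bias
# correction ("rectifier") measured where both labels and predictions exist.
theta_ppi = f_proxy.mean() + (y_gold - f_gold).mean()
```

The rectifier term `(y_gold - f_gold).mean()` is what keeps the estimate unbiased even when the proxy is systematically wrong; MultiPPI extends this idea to multiple proxies with differing costs and correlations.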

Source: arXiv:2603.27414