BEAVER:一种高效的大语言模型确定性验证器 / BEAVER: An Efficient Deterministic LLM Verifier
1️⃣ 一句话总结
这篇论文提出了一个名为BEAVER的框架,它能够像做数学证明一样,确定性地计算出大语言模型输出满足特定要求(如正确性、隐私性)的概率上下限,从而提供传统抽样方法无法给出的可靠保证,帮助在实际应用中更精确地评估和控制模型风险。
As large language models (LLMs) transition from research prototypes to production systems, practitioners often need reliable methods to verify that model outputs satisfy required constraints. While sampling-based estimates provide an intuition of model behavior, they offer no sound guarantees. We present BEAVER, the first practical framework for computing deterministic, sound probability bounds on LLM constraint satisfaction. Given any prefix-closed semantic constraint, BEAVER systematically explores the generation space using novel token trie and frontier data structures, maintaining provably sound bounds at every iteration. We formalize the verification problem, prove soundness of our approach, and evaluate BEAVER on correctness verification, privacy verification and secure code generation tasks across multiple state of the art LLMs. BEAVER achieves 6 to 8 times tighter probability bounds and identifies 3 to 4 times more high risk instances compared to baseline methods under identical computational budgets, enabling precise characterization and risk assessment that loose bounds or empirical evaluation cannot provide.
BEAVER:一种高效的大语言模型确定性验证器 / BEAVER: An Efficient Deterministic LLM Verifier
这篇论文提出了一个名为BEAVER的框架,它能够像做数学证明一样,确定性地计算出大语言模型输出满足特定要求(如正确性、隐私性)的概率上下限,从而提供传统抽样方法无法给出的可靠保证,帮助在实际应用中更精确地评估和控制模型风险。
源自 arXiv: 2512.05439