📄 Abstract - Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

Solving complex tasks usually requires LLMs to generate long multi-step reasoning chains. Previous work has shown that verifying the correctness of individual reasoning steps can further improve the performance and efficiency of LLMs on such tasks and enhance solution interpretability. However, existing verification approaches, such as Process Reward Models (PRMs), are either computationally expensive, limited to specific domains, or require large-scale human or model-generated annotations. Thus, we propose a lightweight alternative for step-level reasoning verification based on data-driven uncertainty scores. We train transformer-based uncertainty quantification heads (UHeads) that use the internal states of a frozen LLM to estimate the uncertainty of its reasoning steps during generation. The approach is fully automatic: target labels are generated either by another larger LLM (e.g., DeepSeek R1) or in a self-supervised manner by the original model itself. UHeads are both effective and lightweight, containing less than 10M parameters. Across multiple domains, including mathematics, planning, and general knowledge question answering, they match or even surpass the performance of PRMs that are up to 810x larger. Our findings suggest that the internal states of LLMs encode their uncertainty and can serve as reliable signals for reasoning verification, offering a promising direction toward scalable and generalizable introspective LLMs.
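To make the UHead idea concrete, here is a minimal PyTorch sketch, assuming a small transformer head that reads a frozen LLM's hidden states for the tokens of one reasoning step and outputs a correctness probability. The class name `UHead`, the hidden-size and layer choices, and the dummy training loop are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of an uncertainty head (UHead) in PyTorch.
# Dimensions, the pooling scheme, and the choice of which hidden layer
# to read are assumptions for illustration; the paper's architecture may differ.
import torch
import torch.nn as nn


class UHead(nn.Module):
    """Small transformer head that scores one reasoning step from the
    frozen LLM's hidden states for the tokens of that step."""

    def __init__(self, llm_hidden: int = 4096, d_model: int = 256,
                 n_layers: int = 2, n_heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(llm_hidden, d_model)  # compress LLM states
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.score = nn.Linear(d_model, 1)  # step-level logit

    def forward(self, step_states: torch.Tensor) -> torch.Tensor:
        # step_states: (batch, step_len, llm_hidden), cached from a frozen LLM
        h = self.encoder(self.proj(step_states))
        # Pool over the step's tokens, then map to P(step is correct).
        return torch.sigmoid(self.score(h.mean(dim=1))).squeeze(-1)


# Training sketch: labels come from a larger LLM judge (e.g., DeepSeek R1)
# or from the original model itself in a self-supervised manner, as in the abstract.
uhead = UHead()
opt = torch.optim.AdamW(uhead.parameters(), lr=1e-4)
loss_fn = nn.BCELoss()

# Dummy batch standing in for cached hidden states and automatic labels.
states = torch.randn(8, 32, 4096)           # 8 reasoning steps, 32 tokens each
labels = torch.randint(0, 2, (8,)).float()  # 1 = step judged correct

pred = uhead(states)
loss = loss_fn(pred, labels)
loss.backward()
opt.step()
```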

Top-level tags: llm model evaluation natural language processing
Detailed tags: reasoning verification uncertainty quantification step-level evaluation internal states efficient verification

📄 Paper Summary

Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads


1️⃣ One-Sentence Summary

This work proposes a lightweight approach that trains small uncertainty-estimation modules on a large language model's internal states to automatically verify the correctness of its reasoning steps, matching or even surpassing the verification performance of far larger models across multiple domains while substantially reducing computational cost.
