arXiv submission date: 2026-03-09
📄 Abstract - Learning When to Sample: Confidence-Aware Self-Consistency for Efficient LLM Chain-of-Thought Reasoning

Large language models (LLMs) achieve strong reasoning performance through chain-of-thought (CoT) reasoning, yet often generate unnecessarily long reasoning paths that incur high inference cost. Recent self-consistency-based approaches further improve accuracy but require sampling and aggregating multiple reasoning trajectories, leading to substantial additional computational overhead. This paper introduces a confidence-aware decision framework that analyzes a single completed reasoning trajectory to adaptively select between single-path and multi-path reasoning. The framework is trained using sentence-level numeric and linguistic features extracted from intermediate reasoning states in the MedQA dataset and generalizes effectively to MathQA, MedMCQA, and MMLU without additional fine-tuning. Experimental results show that the proposed method maintains accuracy comparable to multi-path baselines while using up to 80% fewer tokens. These findings demonstrate that reasoning trajectories contain rich signals for uncertainty estimation, enabling a simple, transferable mechanism to balance accuracy and efficiency in LLM reasoning.

Top-level tags: llm, model evaluation, natural language processing
Detailed tags: chain-of-thought, self-consistency, efficient inference, uncertainty estimation, adaptive sampling

Learning When to Sample: Confidence-Aware Self-Consistency for Efficient LLM Chain-of-Thought Reasoning


1️⃣ One-sentence summary

This paper proposes an intelligent decision framework that analyzes internal signals from a single reasoning pass of a large language model to automatically judge when additional reasoning paths need to be sampled to preserve accuracy, thereby greatly reducing computational overhead with essentially no loss in precision.
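The decision flow described above — score one completed trajectory, then either keep its answer or fall back to multi-path self-consistency voting — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `HEDGE_WORDS` set, the hand-written `needs_more_samples` rule (standing in for the paper's learned classifier over sentence-level features), and the `sample_fn` callback are all hypothetical.

```python
from collections import Counter

# Hypothetical linguistic-uncertainty cues; the paper uses learned
# sentence-level numeric and linguistic features instead.
HEDGE_WORDS = {"maybe", "possibly", "perhaps", "unsure", "likely"}

def trajectory_features(steps):
    """Toy sentence-level features from one completed reasoning trajectory."""
    n = len(steps)
    hedged = sum(any(w in s.lower() for w in HEDGE_WORDS) for s in steps)
    avg_len = sum(len(s.split()) for s in steps) / max(n, 1)
    return {"num_steps": n, "hedge_ratio": hedged / max(n, 1), "avg_len": avg_len}

def needs_more_samples(feats, hedge_thresh=0.2, step_thresh=12):
    """Hand-written stand-in for the paper's learned confidence classifier."""
    return feats["hedge_ratio"] > hedge_thresh or feats["num_steps"] > step_thresh

def answer(single_answer, steps, sample_fn, k=5):
    """Keep the single-path answer when confident; otherwise majority-vote
    over k trajectories (self-consistency)."""
    feats = trajectory_features(steps)
    if not needs_more_samples(feats):
        return single_answer                      # confident: no extra sampling
    votes = [single_answer] + [sample_fn() for _ in range(k - 1)]
    return Counter(votes).most_common(1)[0][0]    # majority vote
```

In the confident case the k-1 extra samples are never drawn, which is where the token savings come from; only uncertain trajectories pay the full self-consistency cost.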

Source: arXiv:2603.08999