arXiv submission date: 2026-01-12
📄 Abstract - Are LLM Decisions Faithful to Verbal Confidence?

Large Language Models (LLMs) can produce surprisingly sophisticated estimates of their own uncertainty. However, it remains unclear to what extent this expressed confidence is tied to the reasoning, knowledge, or decision making of the model. To test this, we introduce $\textbf{RiskEval}$: a framework designed to evaluate whether models adjust their abstention policies in response to varying error penalties. Our evaluation of several frontier models reveals a critical dissociation: models are neither cost-aware when articulating their verbal confidence, nor strategically responsive when deciding whether to engage or abstain under high-penalty conditions. Even when extreme penalties render frequent abstention the mathematically optimal strategy, models almost never abstain, resulting in utility collapse. This indicates that calibrated verbal confidence scores may not be sufficient to create trustworthy and interpretable AI systems, as current models lack the strategic agency to convert uncertainty signals into optimal and risk-sensitive decisions.
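To make the abstract's claim concrete, here is a minimal sketch of the expected-utility argument behind "frequent abstention is the mathematically optimal strategy under extreme penalties." The scoring rule (+1 for a correct answer, -penalty for a wrong one, 0 for abstaining) and the helper names `abstention_threshold` / `risk_sensitive_decision` are illustrative assumptions, not the paper's actual RiskEval implementation.

```python
# Minimal sketch (assumed scoring rule, not the paper's exact setup):
# a correct answer earns +1, a wrong answer costs -penalty, abstention scores 0.
# A cost-aware agent should answer only when its confidence clears a
# penalty-dependent threshold.

def abstention_threshold(penalty: float) -> float:
    """Confidence above which answering beats abstaining.

    Derived from: c * 1 + (1 - c) * (-penalty) >= 0  =>  c >= penalty / (1 + penalty).
    """
    return penalty / (1.0 + penalty)


def risk_sensitive_decision(confidence: float, penalty: float) -> str:
    """Return 'answer' or 'abstain' for a cost-aware agent (hypothetical helper)."""
    return "answer" if confidence >= abstention_threshold(penalty) else "abstain"


if __name__ == "__main__":
    # As the penalty grows, abstention becomes optimal on more and more questions:
    for penalty in (1, 4, 9, 99):
        print(f"penalty={penalty:>3}: answer only if confidence >= "
              f"{abstention_threshold(penalty):.2f}")
    # e.g. a penalty of 99 requires ~0.99 confidence, so a cost-aware model
    # should abstain on most questions -- the paper reports that models
    # almost never do, which is what drives the utility collapse.
```

Under this (assumed) rule, the threshold rises toward 1 as the penalty grows, which is why a model that keeps answering regardless of penalty decouples its verbal confidence from its decisions.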

Top-level tags: llm, model evaluation, theory
Detailed tags: uncertainty quantification, risk sensitivity, abstention behavior, confidence calibration, decision making

Are LLM Decisions Faithful to Verbal Confidence?


1️⃣ One-Sentence Summary

Using an evaluation framework called RiskEval, this paper finds that although current large language models can express seemingly reasonable estimates of their own uncertainty, their verbal confidence is disconnected from their actual decision behavior: even when facing high error penalties, they do not sensibly choose to abstain from answering, which substantially undermines their trustworthiness and practical utility.

Source: arXiv: 2601.07767