arXiv submission date: 2026-01-05
📄 Abstract - Confidence Estimation for LLMs in Multi-turn Interactions

While confidence estimation is a promising direction for mitigating hallucinations in Large Language Models (LLMs), current research predominantly focuses on single-turn settings. The dynamics of model confidence in multi-turn conversations, where context accumulates and ambiguity is progressively resolved, remain largely unexplored. Reliable confidence estimation in multi-turn settings is critical for many downstream applications, such as autonomous agents and human-in-the-loop systems. This work presents the first systematic study of confidence estimation in multi-turn interactions, establishing a formal evaluation framework grounded in two key desiderata: per-turn calibration and monotonicity of confidence as more information becomes available. To facilitate this, we introduce novel metrics, including a length-normalized Expected Calibration Error (InfoECE), and a new "Hinter-Guesser" paradigm for generating controlled evaluation datasets. Our experiments reveal that widely-used confidence techniques struggle with calibration and monotonicity in multi-turn dialogues. We propose P(Sufficient), a logit-based probe that achieves comparatively better performance, although the task remains far from solved. Our work provides a foundational methodology for developing more reliable and trustworthy conversational agents.
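The abstract does not spell out how InfoECE's length normalization or the monotonicity criterion are computed, so the sketch below only illustrates the standard ingredients they build on: a binned Expected Calibration Error evaluated separately at each dialogue turn, and a naive check of whether confidence is non-decreasing as turns accumulate. The function names and data layout are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: the paper's InfoECE normalization and exact
# monotonicity metric are not described in the abstract. This computes the
# standard binned ECE separately for each dialogue turn, plus a crude
# monotonicity check on per-turn confidence trajectories.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: bin-weighted gap between mean confidence and accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    edges[0] = -1e-12  # include confidence == 0 in the first bin
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece

def per_turn_ece(dialogues, n_bins=10):
    """dialogues: list of dialogues, each a list of (confidence, is_correct)
    pairs, one per turn. Returns the ECE computed at each turn index."""
    max_turns = max(len(d) for d in dialogues)
    per_turn = []
    for t in range(max_turns):
        pairs = [d[t] for d in dialogues if len(d) > t]
        confs, labels = zip(*pairs)
        per_turn.append(expected_calibration_error(confs, labels, n_bins))
    return per_turn

def fraction_monotone(dialogues, tol=1e-6):
    """Fraction of dialogues whose confidence never drops across turns,
    one simple way to operationalize the monotonicity desideratum."""
    def non_decreasing(confs):
        return all(b >= a - tol for a, b in zip(confs, confs[1:]))
    return float(np.mean([non_decreasing([c for c, _ in d]) for d in dialogues]))
```

Under this sketch, a well-calibrated, well-behaved estimator would show low `per_turn_ece` values at every turn and a high `fraction_monotone`; how the paper normalizes for dialogue length in InfoECE is not recoverable from the abstract.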

Top tags: llm model evaluation agents
Detailed tags: confidence estimation multi-turn dialogue calibration hallucination mitigation evaluation framework

Confidence Estimation for LLMs in Multi-turn Interactions


1️⃣ One-Sentence Summary

This paper presents the first systematic study of confidence estimation for large language models in multi-turn dialogues. It finds that existing methods perform poorly, and proposes a new evaluation framework together with a probe method that performs comparatively better, laying a foundation for building more reliable conversational agents.

Source: arXiv: 2601.02179