Confidence Should Be Calibrated More Than One Turn Deep
1️⃣ One-Sentence Summary
This paper argues that safe and reliable deployment of large language models in multi-turn conversations requires calibrating model confidence dynamically, conditioned on the conversation history, and proposes a new calibration method and decoding strategy that improve factual accuracy and consistency across turns.
Large Language Models (LLMs) are increasingly applied in high-stakes domains such as finance, healthcare, and education, where reliable multi-turn interactions with users are essential. However, existing work on confidence estimation and calibration, a major approach to building trustworthy LLM systems, largely focuses on single-turn settings and overlooks the risks and potential of multi-turn conversations. In this work, we introduce the task of multi-turn calibration to reframe calibration from a static property into a dynamic challenge central to reliable multi-turn conversation, where calibrating model confidence at each turn conditioned on the conversation history is required. We first reveal the risks of this setting: using Expected Calibration Error at turn T (ECE@T), a new metric that tracks calibration dynamics over turns, we show that user feedback (e.g., persuasion) can degrade multi-turn calibration. To address this, we propose MT-Cal, which minimises ECE@T via a surrogate calibration target, and further leverage calibrated confidence in ConfChat, a decoding strategy that improves both the factuality and consistency of model responses in multi-turn interactions. Extensive experiments demonstrate that MT-Cal achieves strong and consistent performance in multi-turn calibration, and that ConfChat preserves and even enhances model performance in multi-turn interactions. Our results mark multi-turn calibration as a missing link for scaling LLM calibration toward safe, reliable, real-world use.
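The abstract introduces ECE@T only at a high level: standard Expected Calibration Error, but computed over the model's answers at a given turn T rather than over single-turn responses. A minimal sketch of this idea, assuming the common equal-width binning scheme for ECE (the binning choice and the `(confidence, is_correct)` record format are illustrative assumptions, not details from the paper), might look like:

```python
import numpy as np

def ece(confidences, correctness, n_bins=10):
    """Expected Calibration Error with equal-width confidence bins.

    confidences: per-answer confidence scores in [0, 1]
    correctness: per-answer 0/1 correctness labels
    """
    confidences = np.asarray(confidences, dtype=float)
    correctness = np.asarray(correctness, dtype=float)
    # Map each confidence to a bin; clamp 1.0 into the last bin.
    bin_idx = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its mass.
            err += mask.mean() * abs(correctness[mask].mean()
                                     - confidences[mask].mean())
    return err

def ece_at_turn(history, t, n_bins=10):
    """ECE@T: calibration error over responses given at turn t.

    history[t] is a list of (confidence, is_correct) pairs collected
    at turn t, each conditioned on the preceding conversation.
    """
    confs, corrs = zip(*history[t])
    return ece(confs, corrs, n_bins)
```

Tracking `ece_at_turn` as t grows is what lets one observe the degradation the paper reports, e.g. confidence drifting after persuasive user pushback even when answers stay correct.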
From arXiv: 2604.05397