arXiv submission date: 2026-01-22
📄 Abstract - Agentic Uncertainty Quantification

Although AI agents have demonstrated impressive capabilities in long-horizon reasoning, their reliability is severely hampered by the "Spiral of Hallucination," in which early epistemic errors propagate irreversibly. Existing methods face a dilemma: uncertainty quantification (UQ) methods typically act as passive sensors, diagnosing risks without addressing them, while self-reflection mechanisms suffer from continuous or aimless corrections. To bridge this gap, we propose a unified Dual-Process Agentic UQ (AUQ) framework that transforms verbalized uncertainty into active, bi-directional control signals. Our architecture comprises two complementary mechanisms: System 1 (Uncertainty-Aware Memory, UAM), which implicitly propagates verbalized confidence and semantic explanations to prevent blind decision-making; and System 2 (Uncertainty-Aware Reflection, UAR), which uses these explanations as rational cues to trigger targeted inference-time resolution only when necessary. This enables the agent to dynamically balance efficient execution against deep deliberation. Extensive experiments on closed-loop benchmarks and open-ended deep research tasks demonstrate that our training-free approach achieves superior performance and trajectory-level calibration. We believe the principled AUQ framework represents a significant step towards reliable agents.
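The dual-process control loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names, the `fast_answer`/`reflect` placeholders, and the reflection threshold are all assumptions introduced here for clarity.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    """Uncertainty-Aware Memory (System 1 / UAM, sketched):
    each step stores the answer together with verbalized
    confidence and a semantic explanation of the uncertainty."""
    step: str
    answer: str
    confidence: float   # verbalized confidence in [0, 1]
    explanation: str    # semantic explanation propagated to later steps

@dataclass
class DualProcessAgent:
    reflect_threshold: float = 0.5          # assumed trigger level, not from the paper
    memory: list = field(default_factory=list)

    def act(self, step: str) -> str:
        answer, confidence, explanation = self.fast_answer(step)
        # System 2 (UAR, sketched): deliberate only when confidence is
        # low, using the explanation as a targeted rational cue.
        if confidence < self.reflect_threshold:
            answer, confidence = self.reflect(step, explanation)
        # System 1 (UAM, sketched): propagate confidence and explanation
        # forward so later decisions are not made blindly.
        self.memory.append(MemoryEntry(step, answer, confidence, explanation))
        return answer

    def fast_answer(self, step):
        # Placeholder for an LLM call that verbalizes its own uncertainty.
        return "draft", 0.3, "unsure about entity X"

    def reflect(self, step, cue):
        # Placeholder for targeted inference-time resolution.
        return "revised", 0.8
```

The key design point mirrored here is that reflection is gated: System 2 runs only when verbalized confidence falls below a threshold, so the agent avoids the continuous or aimless corrections the abstract criticizes.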

Top-level tags: agents llm model evaluation
Detailed tags: uncertainty quantification hallucination mitigation agent reliability dual-process reasoning closed-loop evaluation

Agentic Uncertainty Quantification


1️⃣ One-sentence summary

This paper proposes a dual-process framework called AUQ that converts the uncertainty an AI agent itself verbalizes into control signals, letting the agent dynamically balance fast execution against deep deliberation. This mitigates the "Spiral of Hallucination," in which early errors are progressively amplified, and markedly improves the agent's reliability and calibration on complex tasks.

Source: arXiv 2601.15703