arXiv submission date: 2026-04-07
📄 Abstract - Towards Trustworthy Report Generation: A Deep Research Agent with Progressive Confidence Estimation and Calibration

As agent-based systems continue to evolve, deep research agents are capable of automatically generating research-style reports across diverse domains. While these agents promise to streamline information synthesis and knowledge exploration, existing evaluation frameworks, typically based on subjective dimensions, fail to capture a critical aspect of report quality: trustworthiness. In open-ended research scenarios where ground-truth answers are unavailable, current evaluation methods cannot effectively measure the epistemic confidence of generated content, making calibration difficult and leaving users susceptible to misleading or hallucinated information. To address this limitation, we propose a novel deep research agent that incorporates progressive confidence estimation and calibration within the report generation pipeline. Our system leverages a deliberative search model, featuring deep retrieval and multi-hop reasoning to ground outputs in verifiable evidence while assigning confidence scores to individual claims. Combined with a carefully designed workflow, this approach produces trustworthy reports with enhanced transparency. Experimental results and case studies demonstrate that our method substantially improves interpretability and significantly increases user trust.
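The abstract describes assigning a confidence score to each individual claim and then calibrating it before the claim appears in the report. As a rough illustration only (the paper does not publish its implementation here), the following Python sketch shows one way claim-level confidence could be aggregated from evidence-support values and then softened with temperature scaling; the names `Claim`, `estimate_confidence`, and `calibrate`, as well as the choice of temperature scaling, are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch: claim-level confidence estimation and post-hoc calibration.
# Not the paper's implementation; names and calibration choice are illustrative.
from dataclasses import dataclass
from math import exp, log


@dataclass
class Claim:
    text: str
    evidence_scores: list[float]  # per-source support scores in [0, 1]


def estimate_confidence(claim: Claim) -> float:
    """Raw confidence: mean evidence support; 0.0 if no evidence was retrieved."""
    if not claim.evidence_scores:
        return 0.0
    return sum(claim.evidence_scores) / len(claim.evidence_scores)


def calibrate(raw: float, temperature: float = 1.5) -> float:
    """Temperature scaling: temperatures > 1 pull scores toward 0.5,
    softening over-confident raw estimates."""
    eps = 1e-6
    raw = min(max(raw, eps), 1.0 - eps)              # avoid log(0)
    logit = log(raw / (1.0 - raw))                   # probability -> logit
    return 1.0 / (1.0 + exp(-logit / temperature))   # scaled logit -> probability


if __name__ == "__main__":
    claim = Claim(
        text="The agent grounds its outputs in retrieved evidence.",
        evidence_scores=[0.9, 0.8, 0.95],
    )
    raw = estimate_confidence(claim)
    print(f"raw={raw:.2f}  calibrated={calibrate(raw):.2f}")
```

Running this prints a raw score of about 0.88 and a calibrated score of about 0.79, illustrating how calibration can temper over-confident claims; the paper's actual estimation and calibration procedure may differ.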

Top-level tags: agents llm model evaluation
Detailed tags: report generation confidence calibration trustworthiness evidence grounding deliberative search

Towards Trustworthy Report Generation: A Deep Research Agent with Progressive Confidence Estimation and Calibration


1️⃣ One-sentence summary

This paper proposes a new deep research agent that estimates and calibrates the confidence of its claims at every step of report generation, addressing the problem that existing AI systems can produce unreliable or fabricated information and thereby yielding research reports that are more transparent and more trustworthy to users.

Source: arXiv 2604.05952