arXiv submission date: 2026-03-25
📄 Abstract - MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

Hallucination remains a critical bottleneck for large language models (LLMs), undermining their reliability in real-world applications, especially in Retrieval-Augmented Generation (RAG) systems. While existing hallucination detection methods employ LLM-as-a-judge to verify LLM outputs against retrieved evidence, they suffer from inherent confirmation bias, where the verifier inadvertently reproduces the errors of the original generation. To address this, we introduce Multi-Agent Reinforced Self-Check for Hallucination (MARCH), a framework that enforces rigorous factual alignment by leveraging deliberate information asymmetry. MARCH orchestrates a collaborative pipeline of three specialized agents: a Solver, a Proposer, and a Checker. The Solver generates an initial RAG response, which the Proposer decomposes into verifiable, claim-level atomic propositions. Crucially, the Checker validates these propositions against retrieved evidence in isolation, deprived of the Solver's original output. This deliberate information asymmetry breaks the cycle of self-confirmation bias. By training this pipeline with multi-agent reinforcement learning (MARL), we enable the agents to co-evolve and optimize factual adherence. Extensive experiments across hallucination benchmarks demonstrate that MARCH substantially reduces hallucination rates. Notably, an 8B-parameter LLM equipped with MARCH achieves performance competitive with powerful closed-source models. MARCH paves a scalable path for factual self-improvement of LLMs through co-evolution. The code is at this https URL.

Top-level tags: llm agents multi-agents
Detailed tags: hallucination detection, retrieval-augmented generation, multi-agent reinforcement learning, factual alignment, self-check

MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination


1️⃣ One-Sentence Summary

This paper proposes MARCH, a multi-agent reinforcement learning framework in which three cooperating, specialized agents cross-check one another under deliberate information asymmetry. This breaks the self-confirmation bias loop that large language models fall into during retrieval-augmented generation, substantially reducing factual errors in model outputs.
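The Solver → Proposer → Checker pipeline described above can be sketched as plain function composition. This is a minimal illustration, not the paper's implementation: the agent names follow the abstract, but the callable signatures, the `CheckResult` container, and the toy string-matching logic in the usage example are all assumptions; the real MARCH agents are LLMs trained jointly with multi-agent RL. The key property the sketch preserves is the information asymmetry: the Checker receives only each atomic proposition plus the retrieved evidence, never the Solver's full answer.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins for the three MARCH agents (the paper's agents
# are LLMs optimized with multi-agent reinforcement learning).
Solver = Callable[[str, List[str]], str]      # (question, evidence) -> answer
Proposer = Callable[[str], List[str]]         # answer -> atomic propositions
Checker = Callable[[str, List[str]], bool]    # (proposition, evidence) -> supported?

@dataclass
class CheckResult:
    answer: str
    propositions: List[str]
    verdicts: List[bool]

    @property
    def factual_rate(self) -> float:
        """Fraction of atomic propositions the Checker found supported."""
        return sum(self.verdicts) / len(self.verdicts) if self.verdicts else 1.0

def march_pipeline(question: str, evidence: List[str],
                   solver: Solver, proposer: Proposer,
                   checker: Checker) -> CheckResult:
    answer = solver(question, evidence)
    propositions = proposer(answer)
    # Information asymmetry: the Checker sees each proposition and the
    # retrieved evidence in isolation -- never the Solver's original output.
    verdicts = [checker(p, evidence) for p in propositions]
    return CheckResult(answer, propositions, verdicts)
```

With toy stand-ins (a substring-matching Checker), a hallucinated claim in the Solver's answer is flagged because the Checker cannot find it in the evidence:

```python
evidence = ["Paris is the capital of France."]
solver = lambda q, ev: "Paris is the capital of France. It has 10 million people."
proposer = lambda a: ["Paris is the capital of France.",
                      "Paris has 10 million people."]
checker = lambda p, ev: any(p in e for e in ev)

result = march_pipeline("What is the capital of France?", evidence,
                        solver, proposer, checker)
# result.verdicts -> [True, False]; the unsupported population claim is caught.
```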

Source: arXiv 2603.24579