arXiv submission date: 2026-01-26
📄 Abstract - HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs

The reliability of Large Language Models (LLMs) in high-stakes domains such as healthcare, law, and scientific discovery is often compromised by hallucinations. These failures typically stem from two sources: data-driven hallucinations and reasoning-driven hallucinations. However, existing detection methods usually address only one source and rely on task-specific heuristics, limiting their generalization to complex scenarios. To overcome these limitations, we introduce the Hallucination Risk Bound, a unified theoretical framework that formally decomposes hallucination risk into data-driven and reasoning-driven components, linked respectively to training-time mismatches and inference-time instabilities. This provides a principled foundation for analyzing how hallucinations emerge and evolve. Building on this framework, we propose HalluGuard, a score based on the Neural Tangent Kernel (NTK) that leverages the NTK's induced geometry and captured representations to jointly identify data-driven and reasoning-driven hallucinations. We evaluate HalluGuard on 10 diverse benchmarks and 9 popular LLM backbones against 11 competitive baselines, and it consistently achieves state-of-the-art performance in detecting diverse forms of LLM hallucinations.
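The abstract does not spell out the bound itself; purely as an illustrative sketch (placeholder symbols, not the authors' notation), the stated decomposition can be pictured as a bound of the form:

```latex
% Illustrative placeholder only -- not the paper's actual Hallucination Risk Bound.
% R_hallu : overall hallucination risk of a model f
% R_data  : data-driven term, tied to training-time mismatch
% R_reason: reasoning-driven term, tied to inference-time instability
\[
  \mathcal{R}_{\mathrm{hallu}}(f)
  \;\le\;
  \underbrace{\mathcal{R}_{\mathrm{data}}(f)}_{\text{training-time mismatch}}
  \;+\;
  \underbrace{\mathcal{R}_{\mathrm{reason}}(f)}_{\text{inference-time instability}}
\]
```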

Top tags: llm model evaluation theory
Detailed tags: hallucination detection neural tangent kernel risk bound benchmark reliability

HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs


1️⃣ One-Sentence Summary

This paper proposes a unified theoretical framework for decomposing the hallucination risk of large language models and, building on it, develops a detection tool named HalluGuard that can simultaneously identify hallucinations caused by data issues and by the reasoning process, performing strongly across a wide range of tests.
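As a rough, hypothetical sketch of what an NTK-based detection score could look like (this is not the paper's HalluGuard; the toy network, the reference/query setup, and the alignment score below are invented for illustration), one can compare the gradient features of a query against those of a reference set:

```python
# Hypothetical sketch only: an empirical-NTK alignment score, NOT the paper's
# HalluGuard implementation. Model, data, and score definition are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in network; the paper works with LLM representations instead.
model = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 1))
params = list(model.parameters())

def grad_features(x: torch.Tensor) -> torch.Tensor:
    """Flattened gradient of the scalar output w.r.t. all parameters."""
    out = model(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, params)
    return torch.cat([g.reshape(-1) for g in grads])

# "References" stand in for well-supported examples; "query" stands in for a
# generated answer whose reliability we want to score.
refs = torch.randn(5, 8)
query = torch.randn(8)

G_refs = torch.stack([grad_features(x) for x in refs])  # shape (5, num_params)
g_q = grad_features(query)                              # shape (num_params,)

# Empirical NTK values K(query, ref_i) = <grad f(query), grad f(ref_i)>.
ntk_row = G_refs @ g_q

# Toy score: best cosine alignment with the reference set. In this sketch, a
# low value would be read as weak support, i.e. higher hallucination risk.
cos = ntk_row / (g_q.norm() * G_refs.norm(dim=1) + 1e-12)
print(f"NTK alignment score: {cos.max().item():.3f}")
```

Per the abstract, the actual method defines its score over the NTK's induced geometry and captured representations of an LLM so as to cover both data-driven and reasoning-driven failures; the sketch above only shows the kernel-as-gradient-inner-product mechanics.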

Source: arXiv:2601.18753