The Warrant Gap: Claim-Conditioned Re-scoring for Fact-Checking
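1️⃣ One-sentence summary
Existing fact-checking models often cite evidence that does not match the claim under review; to address this, the paper proposes SIFT, which re-scores extracted evidence spans conditioned on the original claim and pairs that step with the automated WSP check, tightening the logical link between evidence and claim and improving accuracy and reliability on several benchmark datasets.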
Fact-checking systems built on LLMs achieve high verdict accuracy on standard benchmarks, yet routinely output Supports labels whose cited evidence does not license the claim. Structured decomposition is the natural way to inspect those warrants, but rigid extraction protocols strip the full-claim context that facets need. We introduce SIFT -- claim-conditioned re-scoring of extracted evidence spans against the full claim -- paired with WSP (Warranted Supports Proportion), an automatic NLI check that the cited warrant entails the claim. We evaluate on FEVER, SciFact, 5PILS, and DP across four open-source backbones. SIFT recovers accuracy on cells where naive decomposition costs up to 27.6 points, while raising WSP above direct prompting; WSP itself calibrates against human gold evidence at AUC 0.92 and precision 0.98.
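The two ideas in the abstract can be illustrated with a minimal sketch. It assumes that WSP is the fraction of Supports verdicts whose cited warrant passes an entailment check, and that re-scoring means ranking each extracted span against the full claim. The abstract does not give the exact definitions, so this is one plausible reading rather than the paper's implementation. All names here (`Prediction`, `rescore_spans`, `warranted_supports_proportion`, `toy_overlap`, `toy_entails`) are hypothetical, and the token-overlap scorer is only a stand-in for a real NLI model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Prediction:
    claim: str    # the full claim being checked
    label: str    # verdict, e.g. "Supports" or "Refutes"
    warrant: str  # the evidence span the system cited


def toy_overlap(span: str, claim: str) -> float:
    """Placeholder scorer (NOT an NLI model): share of claim tokens found in the span."""
    claim_tokens = claim.lower().split()
    span_tokens = set(span.lower().split())
    if not claim_tokens:
        return 0.0
    return sum(tok in span_tokens for tok in claim_tokens) / len(claim_tokens)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Placeholder entailment check: every hypothesis token appears in the premise."""
    return toy_overlap(premise, hypothesis) == 1.0


def rescore_spans(
    claim: str,
    spans: List[str],
    score: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every extracted span against the full claim, best first."""
    scored = [(span, score(span, claim)) for span in spans]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def warranted_supports_proportion(
    predictions: List[Prediction],
    entails: Callable[[str, str], bool],
) -> Optional[float]:
    """Fraction of Supports verdicts whose warrant entails the claim.

    Returns None when there are no Supports verdicts to evaluate.
    """
    supports = [p for p in predictions if p.label == "Supports"]
    if not supports:
        return None
    warranted = sum(1 for p in supports if entails(p.warrant, p.claim))
    return warranted / len(supports)
```

In practice the `score` and `entails` callables would wrap an NLI model rather than token overlap; passing them in as parameters keeps the metric independent of the model choice.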
Source: arXiv: 2606.24627