arXiv submission date: 2026-05-18
📄 Abstract - Vector RAG vs LLM-Compiled Wiki: A Preregistered Comparison on a Small Multi-Domain Research

We preregistered a comparison of two ways to help an LLM answer questions over a small research corpus: a single-round Vector RAG system and an LLM-compiled markdown wiki. Both systems answered the same 13 questions over 24 papers using the same answer-generating model, and their answers were scored by blinded LLM judges. The wiki scored much better at connecting findings across papers, but its advantage in answer organization was not strong after judge adjustment. RAG met the preregistered test for single-fact lookup questions. The clean query-side cost result went against the expected wiki advantage: under the tested setup, the wiki used far more query tokens than RAG, so it could not recover any upfront build cost through cheaper queries. Two exploratory analyses changed how we interpret the result. First, claim-level citation checking favored the wiki: its cited pages more often supported the exact claims being made, even though RAG scored better on the overall groundedness rubric. Second, a decomposition-based RAG variant recovered most of the wiki's advantage on cross-paper synthesis at lower LLM-token cost, but it did not recover the wiki advantage in claim-by-claim citation support. The main conclusion is that grounded research synthesis is not a single capability. Systems can differ in how well they organize evidence, how well their citations support each claim, and how much they cost to run. In this study, no architecture was best on all three.
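The query-side cost argument in the abstract can be made concrete with a small break-even calculation. The sketch below is illustrative only: the function name `break_even_queries` and all token counts are hypothetical and are not taken from the paper. It assumes cost is measured in LLM tokens, that the wiki has a one-time build cost, and that RAG has no comparable upfront cost.

```python
from typing import Optional


def break_even_queries(
    build_tokens: float,
    wiki_tokens_per_query: float,
    rag_tokens_per_query: float,
) -> Optional[float]:
    """Return the number of queries after which the wiki's one-time build
    cost is repaid by cheaper queries, or None if that never happens."""
    savings_per_query = rag_tokens_per_query - wiki_tokens_per_query
    if savings_per_query <= 0:
        # The wiki is no cheaper per query, so the build cost is never recovered.
        return None
    return build_tokens / savings_per_query


# Hypothetical numbers, for illustration only (not results from the study).
print(break_even_queries(1_000_000, 2_000, 4_000))   # wiki cheaper per query: 500.0
print(break_even_queries(1_000_000, 20_000, 4_000))  # wiki costlier per query: None
```

The second call mirrors the situation the abstract reports: when the wiki uses more tokens per query than RAG, the per-query saving is negative and no number of queries repays the build cost.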

Top-level tags: llm retrieval augmented generation evaluation
Detailed tags: rag wiki compilation research synthesis citation grounding cost analysis

Vector RAG vs LLM-Compiled Wiki: A Preregistered Comparison on a Small Multi-Domain Research


1️⃣ One-sentence summary

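In a preregistered experiment, this study compares two ways of helping a large language model answer research questions, vector retrieval-augmented generation (RAG) and an LLM-compiled Markdown wiki, and finds that the wiki does better on cross-paper synthesis and citation accuracy but costs more, while RAG has the edge on single-fact lookup and overhead, which indicates that research synthesis is not a single capability and that each architecture has its own strengths and weaknesses.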

Source: arXiv: 2605.18490