Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks
1️⃣ One-sentence summary
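This paper presents the first comprehensive review of security risks in retrieval-augmented generation (RAG) systems, systematically analyzing the threats that can arise across the RAG workflow, such as data poisoning and adversarial attacks, and summarizing input-side and output-side defense techniques along with evaluation standards, with the aim of guiding the construction of safer and more reliable RAG systems.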
Retrieval-Augmented Generation (RAG) significantly mitigates the hallucinations and domain-knowledge deficiencies of large language models by incorporating external knowledge bases. However, the multi-module architecture of RAG introduces complex system-level security vulnerabilities. Guided by the RAG workflow, this paper analyzes the underlying vulnerability mechanisms and systematically categorizes core threat vectors such as data poisoning, adversarial attacks, and membership inference attacks. Based on this threat assessment, we construct a taxonomy of RAG defense technologies from a dual perspective encompassing both input and output stages. The input-side analysis reviews data protection mechanisms including dynamic access control, homomorphic encryption retrieval, and adversarial pre-filtering. The output-side examination summarizes advanced leakage prevention techniques such as federated learning isolation, differential privacy perturbation, and lightweight data sanitization. To establish a unified benchmark for future experimental design, we consolidate authoritative test datasets, security standards, and evaluation frameworks. To the best of our knowledge, this paper presents the first end-to-end survey dedicated to the security of RAG systems. Distinct from existing literature that isolates specific vulnerabilities, we systematically map the entire pipeline, providing a unified analysis of threat models, defense mechanisms, and evaluation benchmarks. By enabling deep insights into potential risks, this work seeks to foster the development of highly robust and trustworthy next-generation RAG systems.
Source: arXiv: 2603.21654