📄 Paper Summary
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
1️⃣ One-Sentence Summary
This paper proposes CLaRa, a unified framework that integrates retrieval and generation into a shared continuous space for joint optimization, addressing two weaknesses of conventional retrieval-augmented generation — overly long contexts and disjointly optimized modules — and achieving state-of-the-art performance on multiple QA benchmarks.
2️⃣ Abstract
Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge but still suffers from long contexts and disjoint retrieval-generation optimization. In this work, we propose CLaRa (Continuous Latent Reasoning), a unified framework that performs embedding-based compression and joint optimization in a shared continuous space. To obtain semantically rich and retrievable compressed vectors, we introduce SCP, a key-preserving data synthesis framework using QA and paraphrase supervision. CLaRa then trains the reranker and generator end-to-end via a single language modeling loss, with gradients flowing through both modules using a differentiable top-k estimator. Theoretically, this unified optimization aligns retrieval relevance with answer quality. Experiments across multiple QA benchmarks show that CLaRa achieves state-of-the-art compression and reranking performance, often surpassing text-based fine-tuned baselines.
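The abstract says gradients flow through both the reranker and generator via a differentiable top-k estimator. The paper's exact estimator is not specified here, so the sketch below shows one common top-k relaxation for intuition only: run k rounds of a temperature-controlled softmax over the relevance scores, hard-masking each round's winner. The function name `relaxed_topk` and the temperature parameter `tau` are invented for this illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def relaxed_topk(scores, k, tau=0.1):
    """Relaxed top-k selection: k rounds of softmax over the scores,
    excluding each round's argmax from subsequent rounds.

    Returns a soft k-hot weight vector summing to k; as tau -> 0 the
    weights approach the hard top-k indicator, while for tau > 0 each
    softmax is differentiable in the scores, letting gradients from a
    downstream loss reach the scorer (here: the reranker)."""
    s = np.asarray(scores, dtype=float).copy()
    w = np.zeros_like(s)
    for _ in range(k):
        p = softmax(s / tau)
        w += p
        s[p.argmax()] = -np.inf  # exclude the selected item next round
    return w

# Toy relevance scores for 4 candidate passages; pick top-2.
weights = relaxed_topk([3.0, 1.0, 2.0, 0.5], k=2, tau=0.1)
```

At a low temperature the weights concentrate on the two highest-scoring candidates (indices 0 and 2 above), so the generator effectively reads the top-k passages while the selection step remains differentiable.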