sui-1: Grounded and Verifiable Long-Form Summarization
1️⃣ One-Sentence Summary
This paper presents sui-1, a 24-billion-parameter model that automatically adds citations to the summaries it generates, letting users easily verify the source of every claim. This addresses the problem of large language models producing unreliable summaries in critical domains such as government and law, and the model's performance exceeds that of models with far more parameters.
Large language models frequently generate plausible but unfaithful summaries that users cannot verify against source text, a critical limitation in compliance-sensitive domains such as government and legal analysis. We present sui-1, a 24B-parameter model that produces abstractive summaries with inline citations, enabling users to trace each claim to its source sentence. Our synthetic data pipeline combines chain-of-thought prompting with multi-stage verification, generating over 22,000 high-quality training examples across five languages from diverse sources including parliamentary documents, web text, and Wikipedia. Evaluation shows sui-1 significantly outperforms all tested open-weight baselines, including models with 3x more parameters. These results demonstrate that task-specific training substantially outperforms scale alone for citation-grounded summarization. Model weights and an interactive demo are publicly available.
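To make the "trace each claim to its source sentence" idea concrete, here is a minimal sketch of a reader-side citation check. The exact inline-citation format sui-1 emits is not specified in this summary, so the bracketed sentence-index convention, the `verify_citations` helper, and the example texts below are all hypothetical illustrations, not the paper's actual pipeline.

```python
import re

def verify_citations(summary: str, source_sentences: list[str]) -> dict:
    """Check that every inline citation in `summary` points at an
    existing source sentence; report valid and dangling citations.

    Assumes a hypothetical convention where citations are bracketed
    zero-based sentence indices, e.g. "[2]" cites source_sentences[2].
    """
    report = {"valid": [], "invalid": []}
    for match in re.finditer(r"\[(\d+)\]", summary):
        idx = int(match.group(1))
        if 0 <= idx < len(source_sentences):
            report["valid"].append((idx, source_sentences[idx]))
        else:
            report["invalid"].append(idx)
    return report

# Toy example (invented data, for illustration only).
source = [
    "The committee approved the budget on 12 March.",
    "Funding for rural broadband was doubled.",
    "The measure passed with 54 votes.",
]
summary = ("The budget, which doubles rural broadband funding [1], "
           "was approved in March [0].")

print(verify_citations(summary, source))
```

A check of this shape only validates that each citation resolves to a real source sentence; judging whether the cited sentence actually supports the claim is the harder faithfulness problem that the paper's multi-stage verification targets during training-data construction.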
Source: arXiv: 2601.08472