arXiv submission date: 2026-04-08
📄 Abstract - Luwen Technical Report

Large language models have demonstrated remarkable capabilities across a wide range of natural language processing tasks, yet their application in the legal domain remains challenging due to the specialized terminology, complex reasoning requirements, and rapidly evolving legal knowledge involved. In this paper, we present Luwen, an open-source Chinese legal language model built upon the Baichuan foundation model through three key techniques: continual pre-training on a large-scale legal corpus, supervised fine-tuning with carefully curated legal instruction data, and retrieval-augmented generation integrated with a comprehensive legal knowledge base. We evaluate Luwen on five representative legal tasks spanning both prediction and generation settings, including legal judgment prediction, judicial examination, legal text summarization, law article question answering, and judicial decision reasoning. Experimental results show that Luwen outperforms several strong baselines, demonstrating the effectiveness of our approach in adapting general-purpose language models to the legal domain.
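Of the three techniques named in the abstract, retrieval-augmented generation is the easiest to illustrate in a few lines. The sketch below is a toy, not Luwen's actual pipeline: the knowledge base entries are placeholders rather than real statutes, the bag-of-words cosine retriever stands in for whatever retriever the authors use, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are hypothetical. It only shows the general shape of the idea: fetch the most relevant passage, then prepend it to the question before it is handed to a language model.

```python
import math
from collections import Counter

# Hypothetical stand-in for a legal knowledge base.
# These entries are placeholders, not real statutes.
KNOWLEDGE_BASE = [
    "placeholder article A: a party that breaches a contract shall compensate losses",
    "placeholder article B: theft of property is punishable under criminal law",
    "placeholder article C: an employer shall sign a written labor contract",
]


def bag_of_words(text):
    """Lowercase, whitespace-split token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=1):
    """Return the k knowledge-base entries most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: cosine(q, bag_of_words(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, k=1):
    """Prepend retrieved passages to the question (the 'augmented' prompt)."""
    context = "\n".join(retrieve(query, k))
    return f"Reference:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("what happens if a party breaches a contract"))
```

A production system would more plausibly use a dense or hybrid retriever over a real legal corpus and pass the resulting prompt to the fine-tuned model; neither is shown here.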

Top-level tags: llm, natural language processing, systems
Detailed tags: legal ai, domain adaptation, retrieval-augmented generation, chinese language model, legal reasoning

Luwen Technical Report


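1️⃣ One-sentence summary

This paper introduces Luwen, an open-source Chinese legal large language model built on the Baichuan foundation model. It combines continual pre-training on a large-scale legal corpus, supervised fine-tuning on curated legal instruction data, and retrieval-augmented generation over a legal knowledge base, and it outperforms several strong baselines on five legal tasks.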

Source: arXiv: 2604.06737