arXiv submission date: 2026-06-10
📄 Abstract - System Report for CCL25-Eval Task 5: New Dataset and LoRA-Fine-Tuned Qwen2.5

Recently, large language models (LLMs) have made promising progress in classical Chinese translation and in the generation of classical poetry. However, domain-specific research on precise translation and affective-semantic understanding of classical poetry remains limited. The main challenge is that most studies treat poetic appreciation as a general-domain problem and neglect its distinctive features, while high-quality, domain-specific datasets are scarce. To address this limitation, we decompose the task into three subtasks: term interpretation, semantic interpretation, and emotional inference. Based on multiple open-source datasets, we perform data cleansing and alignment to construct the Classical Chinese Poetry Instruction Pair Dataset (CCPoetry-49K), which comprises 49,404 high-quality instruction-response pairs explicitly optimized for this domain. We then propose a domain-specialized LLM, called PoetryQwen, by applying Low-Rank Adaptation (LoRA) to fine-tune the Qwen2.5-14B model. Experimental results on the CCL25-Eval Task 5 benchmark show that PoetryQwen achieves a score of 0.757, a 9.7% relative improvement over the Qwen2.5-14B-Instruct baseline (0.690). These findings indicate that PoetryQwen improves performance in precise translation and emotional understanding of classical poetry. We present a new dataset and methodological considerations intended to support the domain-specific optimization of LLMs.
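The abstract names LoRA but does not give its configuration, so the following is only a minimal sketch of the general low-rank update that LoRA applies to a frozen weight matrix, W' = W + (alpha / r) * B * A. The matrix sizes, the rank `r`, the scaling `alpha`, and the helper names are illustrative assumptions, not the settings used for PoetryQwen.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A, the weight a LoRA-adapted layer applies."""
    scale = alpha / r
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Toy sizes chosen only for illustration (hypothetical, not the paper's settings).
d_out, d_in, r, alpha = 4, 6, 2, 4
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable, small random init
b = [[0.0] * r for _ in range(d_out)]                               # trainable, zero init

# With B initialized to zero, the adapted layer starts out identical to the base layer.
w_eff = lora_effective_weight(w, a, b, alpha, r)

# Only A and B are trained, so the trainable parameter count is r * (d_in + d_out).
full_params = d_out * d_in
lora_params = r * (d_in + d_out)
print(f"full: {full_params} parameters, LoRA: {lora_params} parameters")
```

At these toy sizes the saving is small, but for the large projection matrices of a 14B-parameter model and a small rank, r * (d_in + d_out) is a tiny fraction of d_in * d_out, which is the usual motivation for adapting a model this way rather than updating every weight.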

Top-level tags: llm, natural language processing, data
Detailed tags: classical poetry, chinese translation, affective understanding, domain-specific dataset, lora fine-tuning

System Report for CCL25-Eval Task 5: New Dataset and LoRA-Fine-Tuned Qwen2.5


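1️⃣ One-sentence summary

To address the lack of high-quality, domain-specific datasets for classical poetry translation and emotional understanding, this study builds CCPoetry-49K, a classical-poetry dataset of nearly 50,000 instruction-response pairs, and fine-tunes Qwen2.5 with LoRA to obtain PoetryQwen, which scores 9.7% higher than the original model in the evaluation and is notably stronger at precise translation and emotional understanding of classical poetry.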
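The 9.7% figure is a relative gain, not a difference in percentage points. As a quick check, the sketch below recomputes it from the two scores reported in the abstract (0.757 for PoetryQwen, 0.690 for the Qwen2.5-14B-Instruct baseline); the function and variable names are illustrative.

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline

baseline_score = 0.690    # Qwen2.5-14B-Instruct, as reported in the abstract
poetryqwen_score = 0.757  # PoetryQwen, as reported in the abstract

absolute_gain = poetryqwen_score - baseline_score
relative_gain = relative_improvement(poetryqwen_score, baseline_score)

print(f"absolute gain: {absolute_gain:.3f}")  # absolute gain: 0.067
print(f"relative gain: {relative_gain:.1%}")  # relative gain: 9.7%
```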

Source: arXiv: 2606.12392