THIVLVC: Retrieval Augmented Dependency Parsing for Latin
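1️⃣ One-sentence summary
This paper presents THIVLVC, a two-stage system that assists a large language model by retrieving structurally similar sentences from a treebank; it substantially improves dependency parsing accuracy on Latin poetry and exposes inconsistencies in existing annotated data.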
We describe THIVLVC, a two-stage system for the EvaLatin 2026 Dependency Parsing task. Given a Latin sentence, we retrieve structurally similar entries from the CIRCSE treebank using sentence length and POS n-gram similarity, then prompt a large language model to refine the baseline parse from UDPipe using the retrieved examples and UD annotation guidelines. We submit two configurations: one without retrieval and one with retrieval (RAG). On poetry (Seneca), THIVLVC improves CLAS by +17 points over the UDPipe baseline; on prose (Thomas Aquinas), the gain is +1.5 CLAS. A double-blind error analysis of 300 divergences between our system and the gold standard reveals that, among unanimous annotator decisions, 53.3% favour THIVLVC, showing annotation inconsistencies both within and across treebanks.
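The retrieval stage described above can be illustrated with a minimal sketch. The abstract only states that retrieval uses sentence length and POS n-gram similarity; the specific choices here (a length ratio, multiset Jaccard overlap over POS bigrams, an equal-weight combination via `alpha`, and the function names) are assumptions for illustration, not the authors' actual scoring.

```python
from collections import Counter


def pos_ngrams(tags, n=2):
    """Multiset of POS n-grams for one sentence (a list of UPOS tags)."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def ngram_similarity(a, b, n=2):
    """Multiset Jaccard overlap of POS n-grams (assumed metric)."""
    ca, cb = pos_ngrams(a, n), pos_ngrams(b, n)
    union = sum((ca | cb).values())
    return sum((ca & cb).values()) / union if union else 0.0


def length_similarity(a, b):
    """Ratio of the shorter to the longer sentence length (assumed metric)."""
    return min(len(a), len(b)) / max(len(a), len(b)) if a and b else 0.0


def retrieve(query_tags, treebank, k=3, alpha=0.5):
    """Return the ids of the k treebank sentences most similar to the query.

    treebank: iterable of (sentence_id, upos_tags) pairs.
    alpha: hypothetical weight balancing length vs. n-gram similarity.
    """
    scored = [
        (alpha * length_similarity(query_tags, tags)
         + (1 - alpha) * ngram_similarity(query_tags, tags), sent_id)
        for sent_id, tags in treebank
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sent_id for _, sent_id in scored[:k]]


# Toy treebank; the ids and tag sequences are made up for illustration.
toy_treebank = [
    ("s1", ["NOUN", "VERB"]),
    ("s2", ["ADJ", "NOUN", "VERB", "PUNCT"]),
    ("s3", ["ADV", "ADV", "ADV", "ADV", "ADV", "ADV"]),
]
query = ["ADJ", "NOUN", "VERB", "PUNCT"]
top = retrieve(query, toy_treebank, k=2)
```

In a pipeline of this kind, the retrieved sentences and their gold trees would be placed in the prompt next to the UDPipe baseline parse so that the language model can refine it.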
Source: arXiv:2604.05564