arXiv submission date: 2026-06-23
📄 Abstract - Automatic Part-of-Speech Tagging of Arabic-English Dictionary Senses through WordNet

This paper proposes an algorithm for part-of-speech (POS) tagging the senses of a bilingual dictionary. The algorithm is applied to the Al-Mawrid Arabic-English dictionary. The tagging task is accomplished by transferring the POS tags of the English translation equivalents (TEs) to the dictionary senses after a disambiguation process. The English POS tags are acquired from the Princeton WordNet. POS tagging of bilingual dictionary senses is a prerequisite for linking a bilingual dictionary to WordNet and/or standardizing that dictionary into the WordNet-LMF format, where the synset (set of synonyms), not the word, is the basic building block. The reported accuracy is high and the cost is low. Building NLP/HLT tools requires linguistic experts, large investments, and a long time. A statistical approach needs large annotated corpora, and a rule-based approach needs a large lexicon that contains rich linguistic and world knowledge. This motivates what are called resource-light approaches to developing natural language processing (NLP) tools for resource-poor languages.
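
The tag-transfer idea can be illustrated with a minimal sketch. The abstract does not spell out the disambiguation step, so the majority vote below is an assumption and not the paper's method. `TOY_WORDNET_POS` and `tag_sense` are hypothetical names, and the small dictionary is an invented stand-in for a Princeton WordNet lookup.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: English lemma -> set of POS tags.
# The entries are illustrative only and are not taken from the real WordNet.
TOY_WORDNET_POS = {
    "book": {"n", "v"},
    "volume": {"n"},
    "write": {"v"},
    "compose": {"v"},
    "fast": {"a", "r"},
}


def tag_sense(translation_equivalents, lexicon=TOY_WORDNET_POS):
    """Transfer a POS tag to one dictionary sense from its English TEs.

    Assumed disambiguation rule: each TE votes for every POS it can take,
    and the tag(s) with the most votes win. Returns a sorted list, which
    holds one tag for a decisive vote, several for a tie, and is empty
    when no TE is found in the lexicon.
    """
    votes = Counter()
    for te in translation_equivalents:
        for pos in lexicon.get(te.lower(), ()):
            votes[pos] += 1
    if not votes:
        return []
    best = max(votes.values())
    return sorted(pos for pos, count in votes.items() if count == best)


if __name__ == "__main__":
    # A sense glossed as "book, volume": "n" gets two votes, "v" gets one.
    print(tag_sense(["book", "volume"]))
    # A sense glossed only as "fast" stays ambiguous under this rule.
    print(tag_sense(["fast"]))
```

A sense with several TEs tends to resolve cleanly because the TEs share only the intended POS, while a sense with a single ambiguous TE remains tied and would need some further evidence to resolve.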

Top-level tags: llm natural language processing
Detailed tags: pos tagging bilingual dictionary wordnet arabic-english resource-light nlp

Automatic Part-of-Speech Tagging of Arabic-English Dictionary Senses through WordNet


1️⃣ One-sentence summary

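The paper proposes a low-cost, high-accuracy algorithm that automatically assigns a part of speech to each sense in a bilingual dictionary (using an Arabic-English dictionary as the example). It transfers POS tags from the English WordNet to the dictionary senses after disambiguation, which helps resource-poor languages build natural language processing tools quickly.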

Source: arXiv: 2606.24359