语言对口语词汇分类有影响吗?一种多语言生成式元学习方法 / Does language matter for spoken word classification? A multilingual generative meta-learning approach
1️⃣ 一句话总结
本文通过对比单语言、双语言和多语言模型在口语词汇分类任务上的表现,发现多语言模型性能虽然最好,但语言数量的增加对性能提升效果有限,而训练过程中看到的独特数据时长才是更关键的影响因素。
Meta-learning has been shown to have better performance than supervised learning for few-shot monolingual spoken word classification. However, the meta-learning approach remains under-explored in multilingual spoken word classification. In this paper, we apply the Generative Meta-Continual Learning algorithm to spoken word classification. The generative nature of this algorithm makes it viable for use in application, and the meta-learning aspect promotes generalisation, which is crucial in a multilingual setting. We train monolingual models on English, German, French, and Catalan, a bilingual model on English and German, and a multilingual model on all four languages. We find that although the multilingual model performs best, the differences between model performance is unexpectedly low. We also find that the hours of unique data seen during training seems to be a stronger performance indicator than the number of languages included in the training data.
语言对口语词汇分类有影响吗?一种多语言生成式元学习方法 / Does language matter for spoken word classification? A multilingual generative meta-learning approach
本文通过对比单语言、双语言和多语言模型在口语词汇分类任务上的表现,发现多语言模型性能虽然最好,但语言数量的增加对性能提升效果有限,而训练过程中看到的独特数据时长才是更关键的影响因素。
源自 arXiv: 2605.13084