Multilinguality of Large Language Models From a Structural Perspective
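1️⃣ One-sentence summary
By analyzing how large language models represent linguistic structure internally, this paper finds that low-resource languages differ structurally from English more than high- and mid-resource languages do, and that language-specific post-training alters these structures without disrupting the relative relationships between languages.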
Large language models (LLMs) have excelled in processing multiple languages through pre- and post-training on multilingual data, even though English dominates the training data. Prior work focusing on token representations has revealed how those LLMs process non-English text. Although these analyses have provided insightful findings, they fail to capture a structural view, which is an inherent property of language. In this study, we explore the multilinguality of LLMs through representational structural analysis. Our findings reveal that low-resource languages are structurally more different from English than high- and mid-resource languages, and that language-specific post-training alters their structures while preserving inter-language relationships.
Source: arXiv: 2606.01800