Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation
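1️⃣ One-Sentence Summary
This paper investigates whether "function vectors" extracted during in-context learning in multilingual large language models are language-agnostic. Using machine translation as a case study, it shows that a function vector extracted from a single translation direction improves translation into multiple unseen languages, and that removing the vector substantially degrades translation performance, which suggests that these vectors capture translation knowledge shared across languages.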
Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilingual model representations can be language-agnostic, it remains unclear whether the same holds for function vectors. We study whether FVs exhibit language-agnosticity, using machine translation as a case study. Across three decoder-only multilingual LLMs, we find that translation FVs extracted from a single English$\rightarrow$Target direction transfer to other target languages, consistently improving the rank of correct translation tokens across multiple unseen languages. Ablation results show that removing the FV degrades translation across languages with limited impact on unrelated tasks. We further show that base-model FVs transfer to instruction-tuned variants and partially generalize from word-level to sentence-level translation.
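To make the quantities in the abstract concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a function vector can be approximated by averaging activation vectors collected over in-context prompts, that "adding" it means summing it into a hidden state, that "removing" it means projecting its direction out of the hidden state, and that translation quality is scored by the rank of the correct token among the output logits. All function names (`mean_vector`, `add_vector`, `ablate_direction`, `token_rank`) are hypothetical, and the actual extraction and ablation procedures used by the authors may differ.

```python
from typing import Sequence

Vector = list[float]


def mean_vector(activations: Sequence[Sequence[float]]) -> Vector:
    """Average equally sized activation vectors (toy stand-in for FV extraction)."""
    n = len(activations)
    dim = len(activations[0])
    return [sum(a[i] for a in activations) / n for i in range(dim)]


def add_vector(hidden: Sequence[float], fv: Sequence[float], scale: float = 1.0) -> Vector:
    """Steer a hidden state by adding a scaled function vector (assumed intervention)."""
    return [h + scale * f for h, f in zip(hidden, fv)]


def ablate_direction(hidden: Sequence[float], fv: Sequence[float]) -> Vector:
    """Remove the component of the hidden state along fv (assumed form of ablation)."""
    norm_sq = sum(f * f for f in fv)
    if norm_sq == 0:
        return list(hidden)
    coeff = sum(h * f for h, f in zip(hidden, fv)) / norm_sq
    return [h - coeff * f for h, f in zip(hidden, fv)]


def token_rank(logits: Sequence[float], correct_index: int) -> int:
    """1-based rank of the correct token; 1 means it has the highest logit."""
    target = logits[correct_index]
    return 1 + sum(1 for x in logits if x > target)
```

In this framing, a lower `token_rank` after `add_vector` would correspond to the improvement the abstract reports, and a higher rank after `ablate_direction` would correspond to the degradation observed under ablation.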
Source: arXiv: 2604.19678