arXiv submission date: 2026-04-23
📄 Abstract - Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs

LLMs have been showing limitations when it comes to cultural coverage and competence, and in some cases show regional biases such as amplifying Western and Anglocentric viewpoints. While there have been works analysing the cultural capabilities of LLMs, there has not been specific work on highlighting LLM regional preferences when it comes to culture-related questions. In this work, we propose a new dataset based on a comprehensive taxonomy of Culture-Related Open Questions (CROQ). The results show that, contrary to previous cultural bias work, LLMs show a clear tendency towards countries such as Japan. Moreover, our results show that when prompting in languages such as English or other high-resource ones, LLMs tend to provide more diverse outputs and show less inclination towards answering questions highlighting countries for which the input language is an official language. Finally, we also investigate at which point of LLM training this cultural bias emerges, with our results suggesting that the first clear signs appear after supervised fine-tuning, and not during pre-training.

Top-level tags: llm, natural language processing, model evaluation
Detailed tags: cultural bias, regional bias, dataset, supervised fine-tuning, evaluation

Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs


1️⃣ One-sentence summary

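By building a new dataset based on a taxonomy of culture-related questions, the study finds that large language models show clear regional preferences in their answers to cultural questions, with an unusually strong tilt toward Japan, and that this bias emerges mainly at the supervised fine-tuning stage rather than during pre-training.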

Source: arXiv: 2604.21751