arXiv submission date: 2026-04-21
📄 Abstract - AlignCultura: Towards Culturally Aligned Large Language Models?

Cultural alignment in Large Language Models (LLMs) is essential for producing contextually aware, respectful, and trustworthy outputs. Without it, models risk generating stereotyped, insensitive, or misleading responses that fail to reflect cultural diversity with respect to the Helpful, Harmless, and Honest (HHH) paradigm. Existing benchmarks represent early steps toward cultural alignment; yet no benchmark currently enables systematic evaluation of cultural alignment in line with UNESCO's principles of cultural diversity under the HHH paradigm. To address this gap, we built AlignCultura, a two-stage pipeline for cultural alignment. Stage I constructs CULTURAX, an HHH-English dataset grounded in the UNESCO cultural taxonomy, through Query Construction, which reclassifies prompts, expands underrepresented domains (or labels), and prevents data leakage with SimHash. Response Generation then pairs prompts with culturally grounded responses via two-stage rejection sampling. The final dataset contains 1,500 samples spanning 30 subdomains of tangible and intangible cultural forms. Stage II benchmarks CULTURAX on general-purpose models, culturally fine-tuned models, and open-weight LLMs (Qwen3-8B and DeepSeek-R1-Distill-Qwen-7B). Empirically, culturally fine-tuned models improve joint HHH scores by 4%–6%, reduce cultural failures by 18%, achieve 10%–12% efficiency gains, and limit leakage to 0.3%.
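The abstract says data leakage is prevented with SimHash. The paper's exact procedure is not given here, but a minimal sketch of SimHash-based near-duplicate filtering (word-level features, 64-bit fingerprints, and a hypothetical Hamming-distance threshold of 3 are all assumptions for illustration) might look like:

```python
# Illustrative SimHash near-duplicate check, as one might use to keep
# evaluation prompts from leaking into training data. Helper names,
# feature choice, and threshold are assumptions, not the paper's method.
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint over word-level features."""
    v = [0] * bits
    for token in text.lower().split():
        # Hash each token to a 64-bit integer, then vote per bit.
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            v[i] += 1 if (h >> i) & 1 else -1
    # Bits with a positive vote sum become 1 in the fingerprint.
    fp = 0
    for i in range(bits):
        if v[i] > 0:
            fp |= 1 << i
    return fp


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, threshold: int = 3) -> bool:
    """Treat texts as duplicates if their fingerprints are within threshold bits."""
    return hamming(simhash(a), simhash(b)) <= threshold
```

Similar texts share most token hashes, so their fingerprints differ in few bits; a small Hamming distance therefore flags likely duplicates without exact string matching.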

Top-level tags: llm benchmark evaluation
Detailed tags: cultural alignment hhh paradigm dataset construction rejection sampling fine-tuning

AlignCultura: Towards Culturally Aligned Large Language Models?


1️⃣ One-Sentence Summary

This paper proposes AlignCultura, a two-stage pipeline that systematically measures and improves the cultural alignment of large language models in terms of helpfulness, harmlessness, and honesty by constructing CULTURAX, an evaluation dataset spanning 30 subdomains grounded in the UNESCO cultural taxonomy. Experiments show that cultural fine-tuning reduces models' cultural failure rate by 18% while improving joint HHH performance by 4%–6%.

Source: arXiv: 2604.19016