arXiv submission date: 2026-04-28
📄 Abstract - Do LLMs Capture Embodied Cognition and Cultural Variation? Cross-Linguistic Evidence from Demonstratives

Do large language models (LLMs) truly acquire embodied cognition and cultural conventions from text? We introduce demonstratives, fundamental spatial expressions like "this/that" in English and "zhè/nà" in Chinese, as a novel probe for grounded knowledge. Using 6,400 responses from 320 native speakers, we establish a human baseline: English speakers reliably distinguish proximal-distal referents but struggle with perspective-taking, while Chinese speakers switch perspectives fluently but tolerate distal ambiguity. In contrast, five state-of-the-art LLMs fail to inherently understand the proximal-distal contrast and show no cultural differences, defaulting to English-centric reasoning. Our study contributes (i) a new task, based on demonstratives, as a new lens for evaluating embodied cognition and cultural conventions; (ii) empirical evidence of cross-cultural asymmetries in human interpretation; (iii) a new perspective on the egocentric-sociocentric debate, showing both orientations coexist but vary across languages; and (iv) a call to address individual variation in future model design.

Top-level tags: llm natural language processing multi-modal
Detailed tags: embodied cognition cultural variation demonstratives cross-linguistic evaluation

Do LLMs Capture Embodied Cognition and Cultural Variation? Cross-Linguistic Evidence from Demonstratives


1️⃣ One-sentence summary

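This study compares how humans (native English and Chinese speakers) and large language models interpret demonstratives such as "this/that". It finds that the models neither grasp the basic near-far spatial contrast nor show the cross-cultural differences in perspective switching seen in human speakers, which reveals deep limitations of current models in embodied cognition and cultural understanding.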

Source: arXiv: 2604.25423