arXiv submission date: 2026-04-30
📄 Abstract - Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior

Large Language Models (LLMs) can strongly shape social discourse, yet datasets investigating how LLM outputs vary across controlled social and contextual prompting remain sparse. Cognitive Digital Shadows (CDS) is a 190,000-record synthetic corpus supporting analyses of LLM-generated discourse. Each CDS record is generated by one of 19 LLMs, prompted to shadow either a human persona or an AI-assistant role. CDS contains LLM responses on 4 controversial societal topics: vaccines/healthcare, social media disinformation, the gender gap in science, and STEM stereotypes. Persona-conditioned records encode 17 sociodemographic and psychological attributes, providing data linking LLMs' prompts, language, stances and reasoning. Texts are validated for topic anchoring and can support emotional analyses via interpretable NLP (e.g. textual forma mentis networks). CDS is enriched by a pooling platform with user-friendly dashboards, enabling easy, interactive group-level comparisons of emotional and semantic framing across personas, topics and models. The CDS prompting framework supports future audits of LLMs' bias, social sensitivity and alignment.

Top-level tags: llm, natural language processing, benchmark
Detailed tags: bias, social discourse, persona prompting, emotional analysis, synthetic dataset

Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior


1️⃣ One-sentence summary

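This paper introduces Cognitive Digital Shadows (CDS), a large synthetic dataset in which 19 different large language models simulate virtual personas with varying personalities, backgrounds and social media behavior. It uses these personas to study how the models respond to four controversial societal topics, including vaccines and disinformation, helping to show how AI models are influenced by prompts, which biases they exhibit, and how they align with human views.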

Source: arXiv: 2604.27624