arXiv submission date: 2026-04-21
📄 Abstract - Do Emotions Influence Moral Judgment in Large Language Models?

Large language models have been extensively studied for emotion recognition and moral reasoning as distinct capabilities, yet the extent to which emotions influence moral judgment remains underexplored. In this work, we develop an emotion-induction pipeline that infuses emotion into moral situations and evaluate shifts in moral acceptability across multiple datasets and LLMs. We observe a directional pattern: positive emotions increase moral acceptability and negative emotions decrease it, with effects strong enough to reverse binary moral judgments in up to 20% of cases, and with susceptibility scaling inversely with model capability. Our analysis further reveals that specific emotions can sometimes behave contrary to what their valence would predict (e.g., remorse paradoxically increases acceptability). A complementary human annotation study shows humans do not exhibit these systematic shifts, indicating an alignment gap in current LLMs.

Top-level tags: llm machine learning behavior
Detailed tags: emotion induction moral judgment alignment gap emotional valence evaluation

Do Emotions Influence Moral Judgment in Large Language Models?


1️⃣ One-Sentence Summary

Using a purpose-built emotion-induction experiment, this study finds that large language models' moral judgments are significantly swayed by emotion: positive emotions make a model more likely to deem an action morally acceptable, while negative emotions do the opposite. The effect is strong enough to completely reverse the model's moral judgment in up to 20% of cases, and more capable models are less susceptible. Humans, by contrast, show no such systematic shift, revealing an alignment gap between current AI models and humans in how emotion interacts with moral judgment.

Source: arXiv 2604.19125