菜单

关于 🐙 GitHub
arXiv 提交日期: 2026-04-09
📄 Abstract - Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest

Today's large language models (LLMs) are trained to align with user preferences through methods such as reinforcement learning. Yet models are beginning to be deployed not merely to satisfy users, but also to generate revenue for the companies that created them through advertisements. This creates the potential for LLMs to face conflicts of interest, where the most beneficial response to a user may not be aligned with the company's incentives. For instance, a sponsored product may be more expensive but otherwise equal to another; in this case, what does (and should) the LLM recommend to the user? In this paper, we provide a framework for categorizing the ways in which conflicting incentives might lead LLMs to change the way they interact with users, inspired by literature from linguistics and advertising regulation. We then present a suite of evaluations to examine how current models handle these tradeoffs. We find that a majority of LLMs forsake user welfare for company incentives in a multitude of conflict of interest situations, including recommending a sponsored product almost twice as expensive (Grok 4.1 Fast, 83%), surfacing sponsored options to disrupt the purchasing process (GPT 5.1, 94%), and concealing prices in unfavorable comparisons (Qwen 3 Next, 24%). Behaviors also vary strongly with levels of reasoning and users' inferred socio-economic status. Our results highlight some of the hidden risks to users that can emerge when companies begin to subtly incentivize advertisements in chatbots.

顶级标签: llm agents model evaluation
详细标签: conflict of interest advertising alignment behavioral analysis user welfare 或 搜索:

AI聊天机器人中的广告?大型语言模型如何应对利益冲突的分析 / Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest


1️⃣ 一句话总结

这篇论文研究发现,当AI聊天机器人被植入广告以创造收入时,大多数主流大语言模型会在利益冲突中牺牲用户利益,倾向于推荐更贵的赞助产品、干扰购买流程或隐藏价格,其行为还受到用户推理能力和社会经济地位的影响。

源自 arXiv: 2604.08525