arXiv submission date: 2026-01-16
📄 Abstract - When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs

Personalized large language models (LLMs) adapt model behavior to individual users to enhance user satisfaction, yet personalization can inadvertently distort factual reasoning. We show that when personalized LLMs face factual queries, they can generate answers aligned with a user's prior history rather than the objective truth. These personalization-induced hallucinations, which we trace to representational entanglement between personalization and factual representations, degrade factual reliability and may propagate incorrect beliefs. To address this issue, we propose Factuality-Preserving Personalized Steering (FPPS), a lightweight inference-time approach that mitigates personalization-induced factual distortions while preserving personalized behavior. We further introduce PFQABench, the first benchmark designed to jointly evaluate factual and personalized question answering under personalization. Experiments across multiple LLM backbones and personalization methods show that FPPS substantially improves factual accuracy while maintaining personalized performance.
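The abstract describes FPPS only at a high level, as a lightweight inference-time steering method. Below is a minimal sketch of what such inference-time steering could look like in general, assuming an activation-steering style intervention on a Hugging Face causal LM; the model name, steering vector, layer index, and strength `alpha` are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of inference-time activation steering.
# Assumption: a "factuality" direction is added to one decoder layer's hidden
# states during generation, leaving the personalized model weights untouched.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B-Instruct"  # any personalized backbone (assumed)
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder steering vector; in practice it could be estimated, e.g., as the
# mean hidden-state difference between factual and personalization-distorted
# answers at this layer (an assumption, not the paper's recipe).
hidden_size = model.config.hidden_size
factuality_direction = torch.randn(hidden_size)
factuality_direction = factuality_direction / factuality_direction.norm()
alpha = 4.0       # steering strength (assumed hyperparameter)
layer_idx = 20    # assumed mid-to-late decoder layer

def steer(module, inputs, output):
    # Llama-style decoder layers return a tuple; hidden states come first.
    hidden = output[0]
    delta = alpha * factuality_direction.to(device=hidden.device, dtype=hidden.dtype)
    return (hidden + delta,) + tuple(output[1:])

handle = model.model.layers[layer_idx].register_forward_hook(steer)

prompt = "What year did the Apollo 11 mission land on the Moon?"
ids = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**ids, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))

handle.remove()  # detach the hook to restore the unmodified personalized behavior
```

Because the intervention lives in a removable forward hook, it can be switched on only for factual queries and switched off for preference-driven ones, which is one plausible way to trade off factuality against personalization at inference time.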

Top tags: llm model evaluation natural language processing
Detailed tags: hallucination mitigation personalization factuality inference-time intervention benchmark

When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs


1️⃣ One-sentence summary

This paper finds that personalizing large language models to better fit user habits can sometimes distort factual reasoning, leading the model to answer according to a user's past preferences rather than objective facts; to address this, the researchers propose a lightweight method that substantially reduces these personalization-induced "hallucinations" while preserving personalized behavior.

Source: arXiv: 2601.11000