Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework
1️⃣ One-Sentence Summary
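This paper proposes C-BPO, a method that treats the target user's data as positive feedback and other users' data as implicit negative feedback, and uses Positive-Unlabeled (PU) learning theory to correct for preference overlap, so that a large language model can learn each user's distinctive traits more accurately without sacrificing its general capabilities.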
Large Language Model (LLM) personalization aims to align model behavior with individual user preferences. Existing methods often focus on isolated user histories and neglect the essential role of inter-user differences. We propose C-BPO, a framework that personalizes LLMs via preference-calibrated binary signals. By treating the target user's data as positive feedback and other users' data as an auxiliary set of implicit negative signals, C-BPO captures inter-user differences. Other users' data, however, also contains shared task knowledge, so using it directly as negative feedback erroneously penalizes that knowledge (the preference overlap issue). To mitigate this, we derive an objective grounded in Positive-Unlabeled (PU) learning theory. This objective purifies the negative signals by subtracting the "positive bias", ensuring alignment with each user's unique idiosyncrasies without compromising general helpfulness. Experiments across various personalization tasks and backbone LLMs show that C-BPO consistently outperforms baselines, demonstrating the efficacy of preference-calibrated binary signals in modeling inter-user differences.
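The abstract does not give the C-BPO objective itself, so the following is only a minimal sketch of the general idea behind PU-style correction, using the standard non-negative PU risk estimator rather than the paper's derivation. It assumes each response has a hypothetical scalar preference score (how that score is computed is not specified here), that `prior` is an assumed fraction of other users' data that actually matches the target user's preferences, and that a sigmoid surrogate loss is used. All function names are illustrative.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss; small when the sign of score agrees with label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def naive_risk(pos_scores, unl_scores, prior):
    """Baseline: treat every sample from other users as a true negative."""
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    return prior * risk_pos_as_pos + risk_unl_as_neg


def pu_corrected_risk(pos_scores, unl_scores, prior):
    """Generic non-negative PU risk estimate (an assumption, not the paper's exact objective).

    pos_scores: scores on the target user's data (positives)
    unl_scores: scores on other users' data (treated as unlabeled)
    prior:      assumed share of the unlabeled set that is actually positive
    """
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # Subtract the estimated contribution of hidden positives from the
    # negative term, and clamp at zero so the estimate stays non-negative.
    corrected_neg = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos_as_pos + corrected_neg


if __name__ == "__main__":
    target_user = [2.0, 1.5]          # hypothetical scores
    other_users = [1.0, -1.0, -2.0]   # hypothetical scores
    print(naive_risk(target_user, other_users, prior=0.3))
    print(pu_corrected_risk(target_user, other_users, prior=0.3))
```

In this sketch the corrected risk is never larger than the naive one, because the penalty attributed to overlapping (shared) preferences is removed from the negative term. With `prior=0` the two estimates coincide.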
Source: arXiv:2605.10043