arXiv submission date: 2026-05-11
📄 Abstract - Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework

Large Language Model (LLM) personalization aims to align model behaviors with individual user preferences. Existing methods often focus on isolated user histories, neglecting the essential role of inter-user differences. We propose C-BPO, a framework that personalizes LLMs via preference-calibrated binary signals. By treating target user data as positive feedback and other users' data as an auxiliary set of implicit negative signals, C-BPO captures distinct inter-user differences. To mitigate the preference overlap issue, where shared task knowledge is erroneously penalized, we derive an objective grounded in Positive-Unlabeled (PU) learning theory. This approach purifies negative signals by subtracting "positive bias", ensuring alignment with unique idiosyncrasies without compromising general helpfulness. Empirical experiments across various personalization tasks and backbone LLMs show C-BPO consistently outperforms baselines, demonstrating the efficacy of preference-calibrated binary signals in modeling inter-user differences.
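
The abstract does not spell out the C-BPO objective, so the following is only an illustrative sketch of the general idea behind "subtracting positive bias". It uses the classic non-negative PU risk estimator from the PU learning literature, not the paper's own loss. The function names, the scalar `scores`, the sigmoid surrogate loss, and the class prior `prior` (the assumed fraction of other users' data that the target user would also prefer) are all hypothetical choices made for this example.

```python
import math


def sigmoid_loss(score: float, label: int) -> float:
    """Sigmoid surrogate loss; small when the sign of score matches label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_risk(pos_scores, unl_scores, prior):
    """Baseline: treat every sample from other users as a true negative."""
    pos_as_pos = mean(sigmoid_loss(s, +1) for s in pos_scores)
    unl_as_neg = mean(sigmoid_loss(s, -1) for s in unl_scores)
    return prior * pos_as_pos + unl_as_neg


def pu_risk(pos_scores, unl_scores, prior, non_negative=True):
    """Classic PU risk estimate (an assumption here, not the paper's objective).

    pos_scores: model scores on the target user's data (positives).
    unl_scores: model scores on other users' data (treated as unlabeled).
    prior: assumed share of the unlabeled set that is actually positive.
    """
    pos_as_pos = mean(sigmoid_loss(s, +1) for s in pos_scores)
    pos_as_neg = mean(sigmoid_loss(s, -1) for s in pos_scores)
    unl_as_neg = mean(sigmoid_loss(s, -1) for s in unl_scores)
    # Subtract the part of the "negative" loss that is really due to
    # positives hidden in the unlabeled set (the positive bias).
    neg_risk = unl_as_neg - prior * pos_as_neg
    if non_negative:
        # Clamp at zero so the corrected negative risk cannot go negative.
        neg_risk = max(0.0, neg_risk)
    return prior * pos_as_pos + neg_risk


if __name__ == "__main__":
    target_user = [2.0, 1.5, 0.5]       # hypothetical scores
    other_users = [1.0, -0.5, -2.0]     # hypothetical scores
    print("naive:", naive_risk(target_user, other_users, prior=0.3))
    print("PU-corrected:", pu_risk(target_user, other_users, prior=0.3))
```

Under these assumptions, the corrected risk is never larger than the naive one, because the correction only removes the penalty attributed to preferences shared with the target user. In practice the prior is rarely known and would have to be estimated or tuned.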

Top-level tags: llm, model training
Detailed tags: personalization, preference learning, binary feedback, positive-unlabeled learning, calibration

Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework


1️⃣ One-sentence summary

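This paper proposes a new method, C-BPO, that treats the target user's data as positive feedback and other users' data as implicit negative feedback, and uses Positive-Unlabeled (PU) learning theory to correct for preference overlap, so that a large language model can learn each user's distinctive traits more accurately without sacrificing its general capabilities.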

Source: arXiv: 2605.10043