arXiv submission date: 2026-05-27
📄 Abstract - Human Label Variation as Stable Signal: Learning Annotator-Specific Explanation Behavior via Cross-Annotator Preference Optimization

Free-text explanations extend human label variation (HLV) beyond label disagreement by revealing the reasoning and preferences behind annotators' decisions. We study whether large language models (LLMs) can learn and reproduce such annotator-specific label-explanation behavior. Using two sentence-pair tasks with four annotators each -- natural language inference and paraphrase judgment -- we first analyze whether annotators exhibit stable individual patterns. We find that such patterns are weak at the single-annotation level due to strong input-content effects, but become detectable after input-content reduction and annotator-level aggregation. We then compare prompting and supervised fine-tuning (SFT) baselines and propose cross-annotator preference optimization (CAPO), which contrasts a target annotator's response with other valid but less target-specific annotations for the same input. Experiments show that prompting is limited and unstable, SFT better captures annotator-specific behavior, and CAPO further improves aggregation-aware imitation and judge-based attribution while preserving target-specific reasoning patterns under human validation. Overall, our results show that HLV can be learned as annotator-specific label-explanation behavior, suggesting a path toward scalable explanation-based annotation grounded in annotator histories rather than labels alone.
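
The contrast that CAPO relies on, a target annotator's response set against other annotators' responses to the same input, can be illustrated with a small pair-construction sketch. This is only an illustration under stated assumptions, not the authors' implementation: the `Annotation` record, the `build_preference_pairs` helper, the "label: explanation" response format, and the toy data are all hypothetical, and the abstract does not specify the training objective that consumes such pairs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one annotator's label and explanation for one item."""
    item_id: str      # identifies the sentence pair
    annotator: str    # annotator identifier
    label: str
    explanation: str


def build_preference_pairs(annotations, target):
    """Pair the target annotator's response (chosen) with each other
    annotator's response to the same item (rejected)."""
    by_item = defaultdict(list)
    for ann in annotations:
        by_item[ann.item_id].append(ann)

    pairs = []
    for item_id, group in by_item.items():
        chosen = [a for a in group if a.annotator == target]
        rejected = [a for a in group if a.annotator != target]
        # Items the target annotator did not annotate yield no pairs.
        for c in chosen:
            for r in rejected:
                pairs.append({
                    "item_id": item_id,
                    "target": target,
                    "chosen": f"{c.label}: {c.explanation}",
                    "rejected": f"{r.label}: {r.explanation}",
                })
    return pairs


# Toy data invented for illustration only.
annotations = [
    Annotation("ex1", "A1", "entailment", "The second sentence restates the first."),
    Annotation("ex1", "A2", "neutral", "The premise never mentions the location."),
    Annotation("ex1", "A3", "entailment", "Both describe the same event."),
    Annotation("ex2", "A2", "contradiction", "The times conflict."),
]
pairs = build_preference_pairs(annotations, target="A1")
```

Note that a rejected response here is still a valid annotation; it is simply less specific to the target annotator. That is why a pair can share a label and differ only in the explanation.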

Top-level tags: llm, natural language processing, behavior
Detailed tags: human label variation, explanation generation, preference optimization, annotator modeling, annotation behavior

Human Label Variation as Stable Signal: Learning Annotator-Specific Explanation Behavior via Cross-Annotator Preference Optimization


1️⃣ One-sentence summary

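This paper proposes cross-annotator preference optimization (CAPO), a method that lets a large language model learn each annotator's individual preferences from the different explanations that annotators give for the same text, so that it generates labels and explanations closer to a specific annotator's style, addressing the problem that conventional methods learn only "average" behavior.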

Source: arXiv 2605.28802