From Fallback to Frontline: When Can LLMs be Superior Annotators of Human Perspectives?
1️⃣ One-Sentence Summary
This paper finds that when predicting a subgroup's collective opinions on subjective questions, large language models, owing to their low variance and structural properties as estimators, can often outperform human annotators (including in-group members), and can therefore serve as a principled tool rather than merely a cost-saving substitute.
Although large language models (LLMs) are increasingly used as annotators at scale, they are typically treated as a pragmatic fallback rather than a faithful estimator of human perspectives. This work challenges that presumption. By framing perspective-taking as the estimation of a latent group-level judgment, we characterize the conditions under which modern LLMs can outperform human annotators, including in-group humans, when predicting aggregate subgroup opinions on subjective tasks, and show that these conditions are common in practice. This advantage arises from structural properties of LLMs as estimators, including low variance and reduced coupling between representation and processing biases, rather than any claim of lived experience. Our analysis identifies clear regimes where LLMs act as statistically superior frontline estimators, as well as principled limits where human judgment remains essential. These findings reposition LLMs from a cost-saving compromise to a principled tool for estimating collective human perspectives.
Source: arXiv: 2604.17968