arXiv submission date: 2026-07-02
📄 Abstract - Gaming Consensus: Coordinated Manipulation in Crowdsourced Fact-Checking

Crowdsourced fact-checking systems have been adopted by major social media companies such as X, Meta, TikTok, and Google with the aim of combating misleading information at scale without relying on centralized editorial control. These systems have been developed around a common underlying concept: a bridging mechanism that identifies notes flagging misleading information when they receive support from people with different perspectives rather than simple majority support. To our knowledge, the only publicly disclosed bridging algorithms deployed for fact-checking are based on matrix factorization, as deployed by both X and Meta, augmented with additional components addressing abuse, targeted manipulation, and contributor brigades. This work examines the core matrix factorization portion of these systems, presenting theoretical and empirical evaluations of the degree to which coordinated users could vote strategically by leveraging the latent representations to fabricate the appearance of synthetic consensus within the bridging mechanism. Using historical production data, we find that up to 10.7% of lower-quality notes could be manipulated above consensus thresholds using fewer than 10 ratings. We complement these findings with a theoretical analysis, revealing counterintuitively that rating a note as "Not Helpful" can increase its helpfulness score, as well as a cost model quantifying manipulation effort. We have developed and deployed mitigations within X's Community Notes algorithm to address synthetic consensus.
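
To make the "bridging via matrix factorization" idea in the abstract concrete, here is a minimal sketch. It is not the paper's code or the production algorithm. It assumes the commonly described model form, in which a rating is approximated by a global intercept plus a user intercept, a note intercept, and the product of one-dimensional user and note factors, with the note intercept serving as the helpfulness score. The function name `fit_bridging_mf`, the regularization weights, the learning rate, and the toy ratings are all illustrative assumptions.

```python
import random

def fit_bridging_mf(ratings, n_users, n_notes, lam_i=0.15, lam_f=0.03,
                    lr=0.01, epochs=3000, seed=0):
    """Fit r[u, n] ~ mu + i_u[u] + i_n[n] + f_u[u] * f_n[n] by gradient descent.

    ratings: list of (user, note, value); here 1.0 = Helpful, 0.0 = Not Helpful.
    All hyperparameters are illustrative assumptions, not production settings.
    """
    rng = random.Random(seed)
    mu = 0.0
    i_u = [0.0] * n_users
    i_n = [0.0] * n_notes
    f_u = [rng.gauss(0, 0.1) for _ in range(n_users)]
    f_n = [rng.gauss(0, 0.1) for _ in range(n_notes)]
    losses = []
    for _ in range(epochs):
        # Start each gradient (and the loss) with the L2 regularization terms.
        g_mu = 2 * lam_i * mu
        g_iu = [2 * lam_i * v for v in i_u]
        g_in = [2 * lam_i * v for v in i_n]
        g_fu = [2 * lam_f * v for v in f_u]
        g_fn = [2 * lam_f * v for v in f_n]
        loss = lam_i * (mu * mu + sum(v * v for v in i_u) + sum(v * v for v in i_n))
        loss += lam_f * (sum(v * v for v in f_u) + sum(v * v for v in f_n))
        # Accumulate the squared-error terms over all observed ratings.
        for u, n, y in ratings:
            err = mu + i_u[u] + i_n[n] + f_u[u] * f_n[n] - y
            loss += err * err
            g_mu += 2 * err
            g_iu[u] += 2 * err
            g_in[n] += 2 * err
            g_fu[u] += 2 * err * f_n[n]
            g_fn[n] += 2 * err * f_u[u]
        losses.append(loss)
        # Simultaneous full-batch update of every parameter.
        mu -= lr * g_mu
        i_u = [v - lr * g for v, g in zip(i_u, g_iu)]
        i_n = [v - lr * g for v, g in zip(i_n, g_in)]
        f_u = [v - lr * g for v, g in zip(f_u, g_fu)]
        f_n = [v - lr * g for v, g in zip(f_n, g_fn)]
    return {"mu": mu, "user_intercepts": i_u, "note_intercepts": i_n,
            "user_factors": f_u, "note_factors": f_n, "losses": losses}

# Toy data (hypothetical): users 0-1 and users 2-3 hold opposing viewpoints.
# Note 0 is rated Helpful by everyone; notes 1 and 2 each please one side only.
toy_ratings = (
    [(u, 0, 1.0) for u in range(4)]
    + [(0, 1, 1.0), (1, 1, 1.0), (2, 1, 0.0), (3, 1, 0.0)]
    + [(0, 2, 0.0), (1, 2, 0.0), (2, 2, 1.0), (3, 2, 1.0)]
)
result = fit_bridging_mf(toy_ratings, n_users=4, n_notes=3)
print("note intercepts:", [round(v, 3) for v in result["note_intercepts"]])
```

In this toy setup, the note supported across both groups ends up with a higher intercept than the two one-sided notes, whose disagreement is largely absorbed by the factor term. Because every intercept and factor is fit jointly, one added rating shifts the rater's parameters and the global intercept as well as the note's own. That coupling is the surface the abstract describes coordinated raters exploiting, though the sketch does not reproduce the paper's analysis.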

Top-level tags: systems, machine learning
Detailed tags: crowdsourced fact-checking, bridging algorithm, matrix factorization, coordinated manipulation, vulnerability analysis

Gaming Consensus: Coordinated Manipulation in Crowdsourced Fact-Checking


1️⃣ One-sentence summary

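This paper shows that the crowdsourced fact-checking systems used by platforms such as X and Meta (for example, Community Notes) are vulnerable to strategic voting by a small group of coordinated users: fewer than 10 malicious ratings can push a low-quality note above the consensus threshold (up to 10.7% of lower-quality notes in historical production data), and rating a note "Not Helpful" can counterintuitively raise its helpfulness score. The authors also describe mitigations they have deployed in practice.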

Source: arXiv: 2607.01824