arXiv submission date: 2026-02-24
📄 Abstract - Personal Information Parroting in Language Models

Modern language models (LMs) are trained on large scrapes of the Web containing millions of personal information (PI) instances, many of which LMs memorize, increasing privacy risks. In this work, we develop the regexes and rules (R&R) detector suite to detect email addresses, phone numbers, and IP addresses, which outperforms the best regex-based PI detectors. On a manually curated set of 483 instances of PI, we measure memorization, finding that 13.6% are parroted verbatim by the Pythia-6.9b model, i.e., when the model is prompted with the tokens that precede the PI in the original document, greedy decoding generates the entire PI span exactly. We expand this analysis to study models of varying sizes (160M-6.9B) and pretraining time steps (70k-143k iterations) in the Pythia model suite and find that both model size and amount of pretraining are positively correlated with memorization. Even the smallest model, Pythia-160m, parrots 2.7% of the instances exactly. Consequently, we strongly recommend that pretraining datasets be aggressively filtered and anonymized to minimize PI parroting.
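As a rough illustration of the two-step pipeline the abstract describes, here is a minimal sketch in Python: detect PI-like spans with regexes, then test whether greedy decoding from the preceding context reproduces each span verbatim. It assumes Hugging Face `transformers` and the public Pythia checkpoints; the regexes are deliberately simplified stand-ins for the paper's R&R detector suite, and the helper names (`find_pi_spans`, `is_parroted`) and example document are hypothetical, not the authors' code.

```python
# Sketch of (1) regex-based PI detection and (2) the verbatim-parroting
# check via greedy decoding. Simplified illustration, not the R&R suite.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# (1) Simplified PI regexes; the actual R&R detectors are more sophisticated.
PI_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def find_pi_spans(document: str):
    """Yield (kind, regex match) for every PI-like span in a document."""
    for kind, pattern in PI_PATTERNS.items():
        for m in pattern.finditer(document):
            yield kind, m

# (2) Parroting check: prompt with the tokens preceding the PI span and
# test whether greedy decoding reproduces the span exactly.
MODEL_NAME = "EleutherAI/pythia-160m"  # smallest model in the Pythia suite
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def is_parroted(context: str, pi_span: str, slack: int = 4) -> bool:
    input_ids = tokenizer(context, return_tensors="pt").input_ids
    span_len = len(tokenizer(pi_span, add_special_tokens=False).input_ids)
    with torch.no_grad():
        out = model.generate(
            input_ids,
            max_new_tokens=span_len + slack,  # slack for tokenization drift
            do_sample=False,                  # greedy decoding
            pad_token_id=tokenizer.eos_token_id,
        )
    continuation = tokenizer.decode(out[0, input_ids.shape[1]:])
    return continuation.lstrip().startswith(pi_span)

# Hypothetical usage on a toy document.
document = "...you can contact me at jane.doe@example.com for details..."
for kind, m in find_pi_spans(document):
    print(kind, m.group(), is_parroted(document[: m.start()], m.group()))
```

In the paper's setup, the context would be the tokens that precede the PI span in the original pretraining document; the slack on `max_new_tokens` accounts for the PI span tokenizing differently in isolation than in context.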

Top tags: llm model training data
Detailed tags: privacy memorization personal information data filtering model scaling

Personal Information Parroting in Language Models


1️⃣ One-Sentence Summary

This paper finds that large language models memorize and reproduce verbatim substantial amounts of personal information (such as email addresses and phone numbers) from their training data, and that the larger the model and the longer it is pretrained, the higher this privacy-leakage risk; the authors therefore recommend strict filtering and anonymization of pretraining data.

Source: arXiv 2602.20580