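Sentence-Level Contextual Entrainment in Large Language Models
1️⃣ One-sentence summary
This study finds that large language models exhibit "sentence-level contextual entrainment": when the prompt contains a particular sentence (even a fictitious one), the model assigns higher probability to that sentence's tokens during generation. The effect weakens as models get larger, and turning off only 2%-4% of the attention heads effectively suppresses it without hurting model performance.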
Contextual entrainment, a recently discovered phenomenon in large language models (LLMs), is the tendency of a model to assign higher probabilities to tokens that appear in its context. In this work, we extend the phenomenon from the token level to the sentence level by examining the per-token mean log-probability of a sentence rather than the probabilities of individual tokens. We investigate sentence-level contextual entrainment across 26 LLMs from seven families and two datasets that cover both subjective and objective tasks. We find that sentence-level contextual entrainment exists: sentences in the prompt, even counterfactual statements, receive significantly higher probability at inference time. As model size increases, contextual entrainment gradually decreases. We also find that contextual entrainment is controlled by 2% to 4% of the attention heads, and turning off these heads effectively mitigates it without hurting the model's performance.
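To make the sentence-level measure concrete, here is a minimal sketch of the per-token mean log-probability of a sentence, $\frac{1}{n}\sum_{i=1}^{n} \log p(t_i \mid t_{<i}, \text{context})$. The abstract does not spell out how the change is scored, so the `entrainment_shift` difference below is an assumption made for illustration, as are the function names and the example numbers; in practice the per-token log-probabilities would come from an LLM's output distribution.

```python
import math
from typing import Sequence


def mean_log_prob(token_log_probs: Sequence[float]) -> float:
    """Per-token mean log-probability of a sentence.

    `token_log_probs` holds log p(t_i | preceding tokens, context)
    for each token t_i of the sentence.
    """
    if not token_log_probs:
        raise ValueError("sentence must contain at least one token")
    return sum(token_log_probs) / len(token_log_probs)


def entrainment_shift(with_context: Sequence[float],
                      without_context: Sequence[float]) -> float:
    """Hypothetical score (an assumption, not the paper's definition):
    how much the sentence's mean log-probability rises when the
    sentence also appears in the prompt."""
    return mean_log_prob(with_context) - mean_log_prob(without_context)


# Made-up per-token probabilities for the same three-token sentence,
# scored without and with the sentence present in the prompt.
without_ctx = [math.log(0.10), math.log(0.20), math.log(0.05)]
with_ctx = [math.log(0.30), math.log(0.40), math.log(0.20)]

shift = entrainment_shift(with_ctx, without_ctx)
print(f"shift in mean log-probability: {shift:.3f}")
```

A positive shift means the sentence became more likely once it appeared in the context, which is the direction of the effect the abstract describes. Averaging over tokens keeps sentences of different lengths comparable.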
Source: arXiv: 2606.24077