Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models
1️⃣ One-sentence summary
This paper finds that when large language models answer multiple-choice questions, placing the background context before the question and options significantly improves accuracy over the reverse order. The root cause is the model's causal attention mechanism, which prevents option tokens from "seeing" the context, creating an information bottleneck.
Large language models exhibit surprising sensitivity to the structure of the prompt, but the mechanisms underlying this sensitivity remain poorly understood. In this work, we conduct an in-depth investigation of a striking case: in multiple-choice question answering, placing context before the question and options (CQO) outperforms the reverse order (QOC) by over 14 percentage points, consistently across a wide range of models and datasets. Through systematic architectural analysis, we identify causal attention as the core mechanism: in QOC prompts, the causal mask prevents option tokens from attending to context, creating an information bottleneck where context becomes invisible to options.
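The bottleneck described above follows directly from how a causal mask works: token *i* may only attend to tokens *j ≤ i*. A minimal sketch (not the paper's code; the segment lengths and helper names are hypothetical) makes the asymmetry between CQO and QOC concrete:

```python
# Toy illustration of why prompt order matters under causal attention.
# Segments: C = context, Q = question, O = options. Lengths are illustrative.

def segment_positions(order, lengths):
    """Lay segments out left to right and return each segment's token positions."""
    pos, out = 0, {}
    for seg in order:
        out[seg] = list(range(pos, pos + lengths[seg]))
        pos += lengths[seg]
    return out

def options_can_see_context(order, lengths):
    """Under a causal mask, position i attends only to positions j <= i.
    Option tokens can attend to the full context iff every context token
    precedes the last option token."""
    pos = segment_positions(order, lengths)
    return max(pos["C"]) < max(pos["O"])

lengths = {"C": 5, "Q": 3, "O": 4}
print(options_can_see_context(["C", "Q", "O"], lengths))  # CQO -> True
print(options_can_see_context(["Q", "O", "C"], lengths))  # QOC -> False
```

In the CQO layout the context occupies the earliest positions, so every option token's causal window includes it; in QOC the context sits after the options and is masked out entirely, which is the information bottleneck the paper identifies.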
Source: arXiv:2601.14152