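Instruction Anchors: Dissecting the Causal Dynamics of Modality Arbitration
1️⃣ One-sentence summary
This paper uncovers the internal mechanism by which multimodal large language models selectively use image or text information according to user instructions. It finds that a small number of critical attention heads dominate this decision, and that a minimal intervention on them can markedly change the model's behavior.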
Modality following is the capacity of multimodal large language models (MLLMs) to selectively utilize multimodal contexts based on user instructions. It is fundamental to ensuring safety and reliability in real-world deployments. However, the mechanisms governing this decision-making process remain poorly understood. In this paper, we investigate how it works through an information-flow lens. Our findings reveal that instruction tokens function as structural anchors for modality arbitration: shallow attention layers perform non-selective information transfer, routing multimodal cues to these anchors as a latent buffer; modality competition is then resolved within deep attention layers guided by the instruction intent, while MLP layers exhibit semantic inertia, acting as an adversarial force. Furthermore, we identify a sparse set of specialized attention heads that drive this arbitration. Causal interventions demonstrate that manipulating a mere $5\%$ of these critical heads can decrease the modality-following ratio by $60\%$ through blocking, or increase it by $60\%$ on failed samples through targeted amplification. Our work provides a substantial step toward model transparency and offers a principled framework for the orchestration of multimodal information in MLLMs.
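To make the idea of head-level blocking and amplification concrete, here is a toy sketch in plain Python. It is not the paper's procedure, which the abstract does not spell out. It assumes that each attention head contributes an output vector that is summed with the others, that a hypothetical per-head importance score is already available, and that an intervention simply rescales the selected heads (scale 0 to block, scale above 1 to amplify). The names `select_top_heads`, `intervene`, `importance`, and the toy data are all illustrative.

```python
def select_top_heads(importance, fraction):
    """Return indices of the top `fraction` of heads by score (at least one head)."""
    k = max(1, round(fraction * len(importance)))
    ranked = sorted(range(len(importance)), key=lambda h: importance[h], reverse=True)
    return sorted(ranked[:k])


def intervene(head_outputs, selected, scale):
    """Sum per-head output vectors, multiplying the selected heads by `scale`.

    scale = 1.0 leaves the layer unchanged, 0.0 blocks the selected heads,
    and values above 1.0 amplify them.
    """
    chosen = set(selected)
    total = [0.0] * len(head_outputs[0])
    for h, vec in enumerate(head_outputs):
        factor = scale if h in chosen else 1.0
        for i, value in enumerate(vec):
            total[i] += factor * value
    return total


# Toy data (assumed, not from the paper): 20 heads, each writing a 2-d vector.
head_outputs = [[0.1 * h, 1.0] for h in range(20)]
# Hypothetical importance scores; a real study would estimate these causally.
importance = [float(h) for h in range(20)]

critical = select_top_heads(importance, 0.05)        # 5% of 20 heads -> 1 head
baseline = intervene(head_outputs, critical, 1.0)    # no intervention
blocked = intervene(head_outputs, critical, 0.0)     # knock out critical heads
amplified = intervene(head_outputs, critical, 2.0)   # double critical heads
```

In a real model the rescaling would be applied inside the forward pass, for example on each head's slice of the attention output before the output projection, and the effect would be measured on the modality-following ratio rather than on raw vectors.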
Source: arXiv: 2602.03677