SAME: Stabilized Mixture-of-Experts for Multimodal Continual Instruction Tuning
1️⃣ One-Sentence Summary
This paper proposes a new method called SAME that stabilizes the expert selection and update processes, effectively addressing the forgetting and interference that arise when multimodal large language models continually learn new tasks, and thereby enabling more stable and efficient capability expansion without retraining on old data.
Multimodal Large Language Models (MLLMs) achieve strong performance through instruction tuning, but real-world deployment requires them to continually expand their capabilities, making Multimodal Continual Instruction Tuning (MCIT) essential. Recent methods leverage sparse expert routing to promote task specialization, but we find that the expert routing process suffers from drift as the data distribution evolves. For example, a grounding query that previously activated localization experts may instead be routed to irrelevant experts after learning OCR tasks. Meanwhile, the grounding-related experts can be overwritten by new tasks and lose their original functionality. Such failures reflect two underlying problems: router drift, where expert selection becomes inconsistent over time, and expert drift, where shared experts are overwritten across tasks. Therefore, we propose StAbilized Mixture-of-Experts (SAME) for MCIT. To address router drift, SAME stabilizes expert selection by decomposing routing dynamics into orthogonal subspaces and updating only task-relevant directions. To mitigate expert drift, we regulate expert updates via curvature-aware scaling using historical input covariance in a rehearsal-free manner. SAME also introduces adaptive expert activation to freeze selected experts during training, reducing redundant computation and cross-task interference. Extensive experiments demonstrate that SAME achieves state-of-the-art performance.
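The abstract names two stabilization mechanisms without detailing them: projecting router updates into directions orthogonal to past-task inputs, and scaling expert updates with historical input covariance. The following minimal PyTorch sketch illustrates one plausible reading of those ideas; the shapes, the `energy` and `lam` hyperparameters, and the helper names are all illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the authors' code) of the two stabilization ideas
# described in the abstract, under assumed shapes and hyperparameters.
import torch

hidden_dim, num_experts = 16, 4

# Toy router and a single expert of one MoE layer.
router = torch.nn.Linear(hidden_dim, num_experts, bias=False)
expert = torch.nn.Linear(hidden_dim, hidden_dim)

# Historical input covariance accumulated over previous tasks
# (rehearsal-free: only second-order statistics are stored, not raw samples).
cov = torch.zeros(hidden_dim, hidden_dim)

def accumulate_covariance(x):
    """Update the running input covariance with hidden states x of shape [N, D]."""
    global cov
    cov += x.t() @ x / x.shape[0]

def project_router_grad(grad, energy=0.95):
    """Router-drift fix (assumed form): drop gradient components lying in the
    subspace capturing most of the old tasks' input energy, so expert
    selection for past inputs stays stable."""
    eigvals, eigvecs = torch.linalg.eigh(cov)
    order = torch.argsort(eigvals, descending=True)
    eigvals, eigvecs = eigvals[order].clamp_min(0), eigvecs[:, order]
    cum = torch.cumsum(eigvals, 0) / eigvals.sum()
    k = int(torch.searchsorted(cum, torch.tensor(energy))) + 1
    U = eigvecs[:, :k]                 # protected subspace of old-task inputs
    return grad - grad @ U @ U.t()     # keep only the orthogonal component

def scale_expert_grad(grad, lam=1.0):
    """Expert-drift fix (assumed form): curvature-aware scaling that shrinks
    updates along high-covariance (heavily used) input directions."""
    precond = torch.linalg.inv(cov + lam * torch.eye(hidden_dim))
    return grad @ precond

# Simulate statistics gathered from previously learned tasks.
accumulate_covariance(torch.randn(32, hidden_dim))

# Train on a batch from the current task with stabilized updates.
x = torch.randn(8, hidden_dim)
loss = expert(x).pow(2).mean() + router(x).pow(2).mean()
loss.backward()
with torch.no_grad():
    router.weight.grad = project_router_grad(router.weight.grad)
    expert.weight.grad = scale_expert_grad(expert.weight.grad)
# an optimizer step would follow; afterwards the current task's inputs
# are folded into the stored covariance for future tasks
accumulate_covariance(x)
```

The sketch uses one shared covariance for both mechanisms purely for brevity; the abstract leaves open how routing dynamics are decomposed and which statistics the curvature-aware scaling uses, and it additionally mentions adaptive expert activation (freezing selected experts), which is not shown here.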
Source: arXiv: 2602.01990