Abstract - InduceKV: Fixed-Footprint Continual Adaptation of Multimodal LLMs via Inducing KV Memories
Multimodal large language models must adapt to evolving tasks and domains, yet continual improvement under a bounded deployment footprint remains difficult because repeated parameter updates or growing replay stores can accumulate adaptation state over time. We study fixed-footprint continual adaptation: the deployed adaptation state is kept under a fixed memory budget, while the backbone model is left unchanged and task-specific updates are externalized. We propose InduceKV, a retrieval-based method that stores each selected training prefix as an attention-ready memory entry, consisting of a frozen retrieval key and compact layerwise key-value (KV) payloads that can be appended to the model's self-attention cache. Under a strict memory budget, InduceKV constructs a compact inducing set through bilevel selection: a lightweight calibration is fit for retrieval, while the selected memory balances current-task likelihood, anchor-based retention, and coverage in the frozen retrieval space. Across task-incremental instruction tuning, continual VQA, domain-incremental adaptation, and lifelong multimodal instruction tuning, InduceKV consistently improves over PEFT, MoE, replay, and prompt-retrieval baselines under matched memory budgets. We further report backbone-matched, stage-1 CoIN, compute-matched, and scalability diagnostics, showing that the gains are not due to a stronger backbone, replay alone, or an unbounded candidate pool.
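The memory-entry idea described in the abstract (a retrieval key paired with per-layer KV payloads that join the attention cache) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the names `MemoryEntry`, `KVMemory`, and `extend_cache` are hypothetical, cosine similarity is assumed as the retrieval score, a simple capacity check stands in for the paper's bilevel selection, and placing the memory rows ahead of the model's own cache rows is an assumption.

```python
import numpy as np
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    key: np.ndarray  # retrieval key, shape (d_r,)
    kv: list         # one (K, V) pair per layer, each array of shape (m, d)


class KVMemory:
    """Hypothetical fixed-capacity store; not the paper's data structure."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = []

    def add(self, entry: MemoryEntry) -> bool:
        # Refuse to grow past the budget. The paper instead selects which
        # entries to keep; that selection step is not modeled here.
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append(entry)
        return True

    def retrieve(self, query: np.ndarray, top_k: int = 1) -> list:
        # Assumed scoring rule: cosine similarity between query and stored keys.
        keys = np.stack([e.key for e in self.entries])
        norms = np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8
        sims = keys @ query / norms
        order = np.argsort(-sims)[:top_k]
        return [self.entries[i] for i in order]


def extend_cache(cache: list, entry: MemoryEntry) -> list:
    # Concatenate the entry's K/V rows with each layer's cache along the
    # sequence axis (memory rows first, which is an assumption of this sketch).
    return [
        (np.concatenate([mk, k], axis=0), np.concatenate([mv, v], axis=0))
        for (k, v), (mk, mv) in zip(cache, entry.kv)
    ]
```

In this sketch the footprint is bounded only by `capacity`; a query is matched against the stored keys, and the best entry's payload is merged into the per-layer cache before attention is computed over the combined rows.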
InduceKV: Fixed-Footprint Continual Adaptation of Multimodal LLMs via Inducing KV Memories
1️⃣ One-Sentence Summary
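This paper proposes InduceKV, a method that lets multimodal large language models keep learning new tasks under a fixed memory budget without modifying the model itself or storing old data without limit: it selects a small number of key training samples and converts them into directly accessible memory entries, and it significantly outperforms conventional parameter fine-tuning, mixture-of-experts, and data-replay methods across several continual-learning settings.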