arXiv submission date: 2026-03-24
📄 Abstract - Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning

Machine learning models often need to adapt to new data after deployment due to structured or unstructured real-world dynamics. The Continual Learning (CL) framework enables continuous model adaptation, but most existing approaches assume either that each task contains sufficiently many data samples or that the learning tasks are non-overlapping. In this paper, we address the more general setting where each task may have a limited dataset, and tasks may overlap in an arbitrary manner without a priori knowledge. This general setting is substantially more challenging for two reasons. On the one hand, data scarcity necessitates effective contextualization of general knowledge and efficient knowledge transfer across tasks. On the other hand, unstructured task overlapping can easily result in negative knowledge transfer. To address the above challenges, we propose an adaptive mixture-of-experts (MoE) framework over pre-trained models that progressively establishes similarity awareness among tasks. Our design contains two innovative algorithmic components: incremental global pooling and instance-wise prompt masking. The former mitigates prompt association noise through gradual prompt introduction over time. The latter decomposes incoming task samples into those aligning with current prompts (in-distribution) and those requiring new prompts (out-of-distribution). Together, our design strategically leverages potential task overlaps while actively preventing negative mutual interference in the presence of per-task data scarcity. Experiments across varying data volumes and inter-task similarity show that our method enhances sample efficiency and is broadly applicable.
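To make the routing idea concrete, below is a minimal sketch of instance-wise prompt masking over a prompt pool that grows incrementally. The `PromptPool` class, the cosine-similarity key matching, and the fixed `ood_threshold` value are illustrative assumptions, not the paper's actual implementation, which may differ in how queries, keys, and thresholds are defined.

```python
import torch
import torch.nn.functional as F

# Sketch of instance-wise prompt masking over a growing prompt pool.
# Names and the 0.7 threshold are illustrative assumptions.

class PromptPool:
    def __init__(self, embed_dim: int, prompt_len: int, ood_threshold: float = 0.7):
        self.embed_dim = embed_dim
        self.prompt_len = prompt_len
        self.ood_threshold = ood_threshold
        self.keys = []      # one learnable key per prompt
        self.prompts = []   # one learnable prompt per expert

    def add_prompt(self) -> int:
        """Grow the pool by one prompt (a step of gradual prompt introduction)."""
        self.keys.append(torch.nn.Parameter(torch.randn(self.embed_dim)))
        self.prompts.append(
            torch.nn.Parameter(torch.randn(self.prompt_len, self.embed_dim))
        )
        return len(self.prompts) - 1

    def route(self, query: torch.Tensor):
        """Split a batch of query features into in-distribution samples
        (matched to an existing prompt) and out-of-distribution samples
        (flagged as needing a new prompt)."""
        keys = torch.stack(list(self.keys))                    # (P, D)
        sims = F.cosine_similarity(
            query.unsqueeze(1), keys.unsqueeze(0), dim=-1      # (B, P)
        )
        best_sim, best_idx = sims.max(dim=1)
        in_dist_mask = best_sim >= self.ood_threshold          # (B,)
        return in_dist_mask, best_idx


# Usage: route a batch of frozen-encoder features through the pool.
pool = PromptPool(embed_dim=768, prompt_len=5)
pool.add_prompt()  # the pool starts small and grows over time

features = torch.randn(4, 768)  # stand-in for pre-trained [CLS] features
in_dist, expert_idx = pool.route(features)
print("in-distribution mask:", in_dist.tolist())
print("assigned prompt ids:", expert_idx.tolist())
```

In this sketch, samples whose best key similarity falls below the threshold would trigger `add_prompt()` and train against the fresh prompt, mirroring the abstract's split: in-distribution samples reuse existing prompts to exploit task overlap, while out-of-distribution samples get new prompts to prevent negative interference.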

Top-level tags: machine learning, model training, systems
Detailed tags: continual learning, mixture of experts, data scarcity, knowledge transfer, prompt tuning

Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning


1️⃣ One-Sentence Summary

This paper proposes an adaptive mixture-of-experts framework built on pre-trained models. By progressively learning similarities among tasks, it improves sample efficiency when the model continually learns new knowledge in challenging settings with limited data and possibly overlapping tasks, while avoiding negative interference between tasks.

Source: arXiv:2603.23436