arXiv submission date: 2026-06-25
📄 Abstract - Learning to Recover Task Experts from a Multi-Task Merged Model

Multi-task model merging aims to consolidate several task-specific experts into a unified model, yet static merging consistently suffers from parameter interference. Dynamic merging methods aim to bridge this gap, but many rely on costly storage and loading of redundant expert components at inference. In this work, we take the perspective of the task expert and view parameter interference as a parameter perturbation introduced to each expert during the merging process. We show that such perturbations can be modeled as an affine transformation, which can in turn be approximated by additive offsets. Motivated by these observations, we propose Recover Task eXpert (ReTeX), a framework that predicts those offsets in order to undo parameter interference and recover task-expert performance from a single merged checkpoint. To recover the appropriate expert when the task identity is unknown, we introduce a router-free task identifier based on SVD subspace signatures computed offline before inference. At inference, the identifier selects the task whose subspace yields the smallest projection residual for a given input. As a result, ReTeX recovers over 95% of individual-expert performance in both vision and NLP domains, while significantly improving generalization to unseen tasks. Crucially, we also show that parameter offset prediction leads to an emergent adaptive interpolation of seen expert knowledge, which lets ReTeX handle unseen, out-of-distribution (OOD) tasks. Our code is available at this https URL
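
The residual-based task identification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features the signatures are computed from, so the feature matrices, the rank, and the function names `subspace_signature`, `projection_residual`, and `identify_task` are all assumptions made for illustration.

```python
import numpy as np


def subspace_signature(features, rank):
    """Offline step (assumed): keep the top-`rank` right singular vectors
    of a (n_samples, dim) feature matrix as the task's subspace basis."""
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:rank]  # shape (rank, dim), rows are orthonormal


def projection_residual(x, basis):
    """Norm of the component of x that lies outside the span of `basis`."""
    projected = basis.T @ (basis @ x)
    return float(np.linalg.norm(x - projected))


def identify_task(x, signatures):
    """Pick the task whose subspace leaves the smallest projection residual."""
    return min(signatures, key=lambda name: projection_residual(x, signatures[name]))


# Toy setup (hypothetical data): two tasks living in orthogonal 2-D subspaces.
rng = np.random.default_rng(0)
dim, rank = 16, 2
q, _ = np.linalg.qr(rng.standard_normal((dim, 2 * rank)))  # orthonormal columns
task_bases = {"task_a": q[:, :rank], "task_b": q[:, rank:]}

# Offline: build one signature per task from features drawn in its subspace.
signatures = {
    name: subspace_signature(rng.standard_normal((50, rank)) @ basis.T, rank)
    for name, basis in task_bases.items()
}

# Inference: an input lying in task_b's subspace is matched to task_b.
x = task_bases["task_b"] @ np.array([1.0, -0.5])
predicted = identify_task(x, signatures)
```

Because the signatures are computed once offline, inference only needs one projection per candidate task and no trained router, which is the property the abstract highlights.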

Top-level tags: machine learning, multi-modal
Detailed tags: multi-task merging, parameter interference, expert recovery, task identification, affine transformation

Learning to Recover Task Experts from a Multi-Task Merged Model


1️⃣ One-sentence summary

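This paper proposes ReTeX, a framework that predicts and removes the parameter interference introduced when merging multi-task models (offsets akin to added noise), efficiently recovering each task's original expert performance from a single merged checkpoint; it also uses SVD subspace signatures for router-free task identification and substantially improves generalization to unseen tasks.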

Source: arXiv: 2606.26902