arXiv submission date: 2026-03-03
📄 Abstract - ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation

Model merging aims to combine multiple task-specific expert models into a single model while preserving generalization across diverse tasks. However, interference among experts, especially when they are trained on different objectives, often leads to significant performance degradation. Despite recent progress, resolving this interference without data access, retraining, or architectural modification remains a fundamental challenge. This paper provides a theoretical analysis demonstrating that the input covariance of each task, a key factor for optimal merging, can be implicitly estimated from the parameter differences of its fine-tuned model, even in a fully data-free setting. Building on this insight, we introduce ACE-Merging, an Adaptive Covariance Estimation framework that effectively mitigates inter-task interference. Our approach features a principled, closed-form solution that contrasts with prior iterative or heuristic methods. Extensive experiments on both vision and language benchmarks demonstrate that ACE-Merging sets a new state of the art among data-free methods. It consistently outperforms existing baselines; for example, ACE-Merging achieves an average absolute improvement of 4% over prior methods across seven tasks on GPT-2. Owing to its efficient closed-form formulation, ACE-Merging delivers superior performance at modest computational cost, providing a practical and theoretically grounded solution for model merging.
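The abstract describes a closed-form, covariance-weighted merge in which each task's input covariance is estimated from its fine-tuned model's parameter differences. The paper's exact estimator is not given here, so the following is only a minimal sketch of the general idea: it uses `delta.T @ delta` (an assumed proxy built from the parameter difference) as each task's covariance, then solves the weighted least-squares combination per layer in closed form.

```python
import numpy as np

def merge_covariance_weighted(w_base, experts, eps=1e-6):
    """Illustrative covariance-weighted merge of per-layer weight matrices.

    Each task's input covariance C_t is approximated from the parameter
    difference delta_t = w_t - w_base (an assumed proxy, not the paper's
    exact estimator). The merged weight minimizes
        sum_t tr((W - W_t) C_t (W - W_t)^T),
    whose closed-form solution is W* = (sum_t W_t C_t)(sum_t C_t)^{-1}.
    """
    d = w_base.shape[1]
    cov_sum = eps * np.eye(d)            # ridge term keeps the inverse stable
    weighted_sum = np.zeros_like(w_base)
    for w_t in experts:
        delta = w_t - w_base
        # Proxy covariance from parameter differences (assumption).
        c_t = delta.T @ delta + eps * np.eye(d)
        cov_sum += c_t
        weighted_sum += w_t @ c_t
    return weighted_sum @ np.linalg.inv(cov_sum)
```

With two experts displaced symmetrically around the base model, this merge recovers (approximately) the base weights, since their covariance weights coincide and their deltas cancel.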

Top-level tags: model training, machine learning systems
Detailed tags: model merging, data-free, covariance estimation, parameter interpolation, multi-task learning

ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation


1️⃣ One-sentence summary

This paper proposes a new method that requires no original data, retraining, or architectural changes: it adaptively estimates the statistical relationships among tasks to effectively merge multiple expert models, significantly reducing inter-model interference while preserving per-task performance, and achieves state-of-the-art results on vision and language tasks.

Source: arXiv:2603.02945