TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
1️⃣ One-Sentence Summary
This paper proposes a new method that lets Mixture-of-Experts models 'understand the task': by attaching semantic labels to each task and guiding the model's internal routing to align with them, it effectively resolves the interference between different tasks in unified image generation and editing models, thereby improving generation quality.
Unified image generation and editing models suffer from severe task interference in dense diffusion transformer architectures, where a shared parameter space must compromise between conflicting objectives (e.g., local editing vs. subject-driven generation). While the sparse Mixture-of-Experts (MoE) paradigm is a promising solution, its gating networks remain task-agnostic: they route based on local features and are unaware of the global task intent. This task-agnostic nature prevents meaningful specialization and fails to resolve the underlying task interference. In this paper, we propose a novel framework that injects semantic intent into MoE routing. We introduce a Hierarchical Task Semantic Annotation scheme to create structured task descriptors (e.g., scope, type, preservation). We then design a Predictive Alignment Regularization that aligns internal routing decisions with the task's high-level semantics. This regularization evolves the gating network from a task-agnostic executor into a task-aware dispatch center. Our model effectively mitigates task interference, outperforming dense baselines in fidelity and quality, and our analysis shows that experts naturally develop clear, semantically correlated specializations.
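To make the routing mechanism described above concrete, below is a minimal PyTorch sketch of task-aware gating: a structured task descriptor (scope, type, preservation) is embedded into a task-intent vector, added to the per-token gating logits, and a KL-based regularizer pulls the average routing distribution toward a task-specific prior over experts. The class names, descriptor vocabularies, and the KL form of the alignment term are illustrative assumptions, not the paper's released implementation.

```python
# A minimal sketch of task-aware MoE gating with an alignment regularizer.
# Module names, descriptor vocabularies, and the exact loss form are
# illustrative assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskDescriptorEncoder(nn.Module):
    """Embeds a structured task descriptor (scope, type, preservation)
    into a single task-intent vector (hypothetical field vocabularies)."""
    def __init__(self, d_task: int, n_scope: int = 4, n_type: int = 8, n_preserve: int = 4):
        super().__init__()
        self.scope_emb = nn.Embedding(n_scope, d_task)
        self.type_emb = nn.Embedding(n_type, d_task)
        self.preserve_emb = nn.Embedding(n_preserve, d_task)

    def forward(self, scope_id, type_id, preserve_id):
        return self.scope_emb(scope_id) + self.type_emb(type_id) + self.preserve_emb(preserve_id)


class TaskAwareGate(nn.Module):
    """Scores experts from local token features plus the global task embedding."""
    def __init__(self, d_model: int, d_task: int, n_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.token_proj = nn.Linear(d_model, n_experts)
        self.task_proj = nn.Linear(d_task, n_experts)  # injects task intent into routing

    def forward(self, tokens, task_emb):
        # tokens: (batch, seq, d_model); task_emb: (batch, d_task)
        logits = self.token_proj(tokens) + self.task_proj(task_emb).unsqueeze(1)
        probs = logits.softmax(dim=-1)                   # per-token routing distribution
        top_w, top_idx = probs.topk(self.top_k, dim=-1)  # sparse top-k expert selection
        return probs, top_w, top_idx


def alignment_loss(routing_probs, task_prior):
    """Pulls the batch-averaged routing distribution toward a task-specific
    prior over experts -- one plausible reading of 'predictive alignment'."""
    mean_routing = routing_probs.mean(dim=(0, 1))        # (n_experts,)
    return F.kl_div(mean_routing.log(), task_prior, reduction="sum")


# Toy usage: 8 experts; a hypothetical "local editing" prior favouring experts 0-1.
encoder = TaskDescriptorEncoder(d_task=16)
gate = TaskAwareGate(d_model=64, d_task=16, n_experts=8)

tokens = torch.randn(2, 10, 64)
task_emb = encoder(torch.tensor([1, 1]), torch.tensor([3, 3]), torch.tensor([0, 0]))
probs, top_w, top_idx = gate(tokens, task_emb)

prior = torch.tensor([0.3, 0.3] + [0.4 / 6] * 6)
loss = alignment_loss(probs, prior)
print(top_idx.shape, float(loss))
```

In this sketch the alignment term is added to the diffusion training objective with a small weight, so that routing gradually specializes by task while the standard generation loss still drives image quality.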
Source: arXiv: 2601.08881