arXiv submission date: 2026-01-06
📄 Abstract - The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Mixture of Experts models are widely assumed to achieve domain specialization through sparse routing. In this work, we question this assumption by introducing COMMITTEEAUDIT, a post hoc framework that analyzes routing behavior at the level of expert groups rather than individual experts. Across three representative models and the MMLU benchmark, we uncover a domain-invariant Standing Committee. This is a compact coalition of routed experts that consistently captures the majority of routing mass across domains, layers, and routing budgets, even when architectures already include shared experts. Qualitative analysis further shows that Standing Committees anchor reasoning structure and syntax, while peripheral experts handle domain-specific knowledge. These findings reveal a strong structural bias toward centralized computation, suggesting that specialization in Mixture of Experts models is far less pervasive than commonly believed. This inherent bias also indicates that current training objectives, such as load-balancing losses that enforce uniform expert utilization, may be working against the model's natural optimization path, thereby limiting training efficiency and performance.
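The group-level view described in the abstract can be illustrated with a minimal sketch. It is not the COMMITTEEAUDIT implementation, whose details are not given here. It assumes, purely for illustration, that a "committee" is the set of experts appearing in the top-k by routing mass in every domain. The function names and the toy routing numbers are hypothetical.

```python
from typing import Dict, List, Set


def top_k_experts(mass: List[float], k: int) -> Set[int]:
    """Return the indices of the k experts with the largest routing mass."""
    ranked = sorted(range(len(mass)), key=lambda i: mass[i], reverse=True)
    return set(ranked[:k])


def standing_committee(routing: Dict[str, List[float]], k: int) -> Set[int]:
    """Experts that are in the top-k of every domain (illustrative definition)."""
    per_domain = [top_k_experts(m, k) for m in routing.values()]
    return set.intersection(*per_domain)


def committee_share(mass: List[float], committee: Set[int]) -> float:
    """Fraction of one domain's total routing mass captured by the committee."""
    return sum(mass[i] for i in committee) / sum(mass)


# Toy per-domain routing mass for 5 routed experts (made-up numbers).
routing = {
    "math":    [40.0, 35.0, 5.0, 15.0, 5.0],
    "law":     [38.0, 30.0, 20.0, 6.0, 6.0],
    "biology": [45.0, 25.0, 5.0, 5.0, 20.0],
}

committee = standing_committee(routing, k=2)
shares = {d: committee_share(m, committee) for d, m in routing.items()}
print(committee, shares)
```

In this toy data the same two experts lead in every domain and take most of the routing mass, which is the pattern the abstract calls domain-invariant. A real audit would aggregate router outputs per layer and per routing budget rather than use hand-written numbers.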

Top-level tags: model training theory model evaluation
Detailed tags: mixture-of-experts routing analysis specialization structural bias training efficiency

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models


1️⃣ One-sentence summary

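By analyzing several models, this paper finds that Mixture-of-Experts models contain a core group of experts that dominates routing across problems from many different domains. This suggests that the models' actual degree of specialization is far lower than commonly assumed, and that current training methods that push for balanced expert utilization may in fact limit model performance.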

Source: arXiv:2601.03425