arXiv submission date: 2026-01-26
📄 Abstract - RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents

Mixture-of-Agents (MoA) improves LLM performance through layered collaboration, but its dense topology raises costs and latency. Existing methods employ LLM judges to filter responses, yet still require all models to perform inference before judging, failing to cut costs effectively. They also lack model selection criteria and struggle with large model pools, where full inference is costly and can exceed context limits. To address this, we propose RouteMoA, an efficient mixture-of-agents framework with dynamic routing. It employs a lightweight scorer to perform initial screening by predicting coarse-grained performance from the query, narrowing candidates to a high-potential subset without inference. A mixture of judges then refines these scores through lightweight self- and cross-assessment based on existing model outputs, providing posterior correction without additional inference. Finally, a model ranking mechanism selects models by balancing performance, cost, and latency. RouteMoA outperforms MoA across varying tasks and model pool sizes, reducing cost by 89.8% and latency by 63.6% in the large-scale model pool.
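For intuition, below is a minimal Python sketch of the first two stages described in the abstract: query-only coarse screening, followed by judge-based posterior correction of the scores. All names and interfaces (`ModelProfile`, `coarse_screen`, `judge_refine`, the linear blend) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of RouteMoA-style screening and refinement
# (assumed interfaces; not the paper's implementation).

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ModelProfile:
    name: str
    cost_per_call: float   # assumed unit: USD per request
    avg_latency_s: float   # assumed unit: seconds per request


def coarse_screen(query: str,
                  pool: List[ModelProfile],
                  scorer: Callable[[str, str], float],
                  top_k: int) -> Dict[str, float]:
    """Stage 1: a lightweight scorer predicts coarse performance from the
    query alone, so no candidate model runs inference at this point."""
    prior = {m.name: scorer(query, m.name) for m in pool}
    kept = sorted(prior, key=prior.get, reverse=True)[:top_k]
    return {name: prior[name] for name in kept}


def judge_refine(prior: Dict[str, float],
                 existing_outputs: Dict[str, str],
                 judge: Callable[[str, Dict[str, str]], float],
                 blend: float = 0.5) -> Dict[str, float]:
    """Stage 2: a mixture of judges performs lightweight self-/cross-assessment
    over outputs that already exist, correcting the coarse prior without
    triggering additional inference. The linear blend is an assumption."""
    refined = {}
    for name, p in prior.items():
        if name in existing_outputs:
            refined[name] = (1 - blend) * p + blend * judge(name, existing_outputs)
        else:
            refined[name] = p
    return refined
```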

Top-level tags: llm agents systems
Detailed tags: mixture-of-agents dynamic routing efficient inference model selection cost reduction

RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents


1️⃣ One-Sentence Summary

This paper proposes RouteMoA, an efficient mixture-of-agents framework that pre-screens models with a lightweight scorer and then dynamically selects models through assessment and ranking mechanisms, substantially cutting computational cost and latency while preserving performance.
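The final selection step balances predicted quality against cost and latency. One simple way to picture this trade-off is a weighted utility score, sketched below; the linear form, the weights, and the example numbers are assumptions for illustration, not the ranking rule used in the paper.

```python
# Illustrative ranking step: trade predicted performance against cost and
# latency (the weighted-sum form is an assumption, not the paper's rule).

from typing import Dict, List, Tuple


def rank_and_select(scores: Dict[str, float],
                    profiles: Dict[str, Tuple[float, float]],  # name -> (cost, latency)
                    n_select: int,
                    w_perf: float = 1.0,
                    w_cost: float = 0.2,
                    w_lat: float = 0.05) -> List[str]:
    """Stage 3: rank refined scores after penalizing cost and latency."""
    def utility(name: str) -> float:
        cost, latency = profiles[name]
        return w_perf * scores[name] - w_cost * cost - w_lat * latency
    return sorted(scores, key=utility, reverse=True)[:n_select]


# Example: a cheap, fast model can outrank a slightly stronger but costly one.
scores = {"model_a": 0.82, "model_b": 0.78}
profiles = {"model_a": (4.0, 6.0), "model_b": (0.5, 1.5)}
print(rank_and_select(scores, profiles, n_select=1))  # -> ['model_b']
```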

Source: arXiv:2601.18130