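Multi-Plane HyperX: A Low-Latency and Cost-Effective Network for Large-Scale AI and HPC Systems
1️⃣ One-sentence summary
This paper is the first to apply multi-plane techniques to the HyperX direct network architecture. It shows that, compared with mainstream networks such as multi-plane Fat-Tree and Dragonfly, multi-plane HyperX significantly reduces network diameter and improves cost-effectiveness, making it better suited to large-scale AI and high-performance computing systems.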
Multi-plane architectures have become increasingly prevalent in the Fat-Tree networks of AI data centers. By leveraging multiple ports on a single network interface card (NIC) or multiple NICs within a scale-up domain, each port or NIC is allocated to an independent network plane, thereby provisioning the overall system with multiple network planes. However, no prior literature has explored the application of multi-plane technologies to direct networks such as HyperX. This paper investigates the multi-plane HyperX network and demonstrates that, compared to state-of-the-art network topologies like multi-plane Fat-Tree, Dragonfly, and Dragonfly+, the multi-plane HyperX architecture achieves a significantly smaller network diameter and superior cost-effectiveness.
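As a rough illustration of the two ideas in the abstract, the sketch below models a regular HyperX as a grid of switch coordinates in which switches are fully connected along each dimension, so the minimal hop count between two switches is the number of coordinates in which they differ. It also shows the simplest possible port-to-plane assignment. The function names, the round-robin plane mapping, and the example shapes are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def hyperx_switches(shape):
    """Enumerate switch coordinates for a HyperX with the given per-dimension sizes."""
    return list(product(*(range(size) for size in shape)))


def hop_distance(a, b):
    """Minimal switch-to-switch hops: one hop per dimension whose coordinate differs.

    This relies on the HyperX property that all switches sharing every
    coordinate but one are directly connected.
    """
    return sum(1 for x, y in zip(a, b) if x != y)


def diameter(shape):
    """Brute-force diameter: the largest minimal hop count over all switch pairs."""
    switches = hyperx_switches(shape)
    return max(hop_distance(a, b) for a in switches for b in switches)


def plane_of_port(port_index, num_planes):
    """Hypothetical round-robin mapping of a NIC port to an independent plane."""
    return port_index % num_planes


if __name__ == "__main__":
    print(diameter((4, 4)))      # 2D HyperX with 16 switches
    print(diameter((3, 3, 3)))   # 3D HyperX with 27 switches
    print([plane_of_port(p, 4) for p in range(8)])
```

Under this model the diameter equals the number of dimensions with more than one switch, which is why keeping the dimension count low keeps the diameter small.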
Source: arXiv: 2604.23519