UB-Mesh: A Hierarchically Localized nD-FullMesh Datacenter Network Architecture

Abstract

As Large-scale Language Models (LLMs) continue to scale, the computational power and bandwidth they require escalate. To address this, we introduce UB-Mesh, a novel AI datacenter network architecture designed to enhance scalability, performance, cost-efficiency and availability. Unlike traditional datacenters that provide symmetrical node-to-node bandwidth, UB-Mesh employs a hierarchically localized nD-FullMesh network topology. This design fully leverages the data locality of LLM training, prioritizing short-range, direct interconnects to minimize data movement distance and reduce switch usage. Although UB-Mesh's nD-FullMesh topology offers several theoretical advantages, its concrete architecture design, physical implementation and networking system optimization present new challenges. For the actual construction of UB-Mesh, we first design the UB-Mesh-Pod architecture, which is based on a 4D-FullMesh topology. UB-Mesh-Pod is implemented via a suite of hardware components that serve as the foundational building blocks, including specially designed NPUs, CPUs, Low-Radix-Switches (LRS), High-Radix-Switches (HRS), NICs and others. These components are interconnected via a novel Unified Bus (UB) technique, which enables flexible IO bandwidth allocation and hardware resource pooling. For networking system optimization, we propose an advanced routing mechanism named All-Path-Routing (APR) to efficiently manage data traffic. These optimizations, combined with topology-aware performance enhancements and robust reliability measures such as a 64+1 backup design, result in 2.04x higher cost-efficiency and 7.2% higher network availability compared to a traditional Clos architecture, as well as 95%+ linearity in various LLM training tasks.
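The abstract does not spell out how an nD-FullMesh is wired, so the following is only an illustrative sketch under a stated assumption: each node is addressed by an n-dimensional coordinate and is directly linked to every node that differs from it in exactly one coordinate (a full mesh along each dimension). The function names and the dimension sizes `(8, 8, 4, 4)` are hypothetical and chosen only to make the arithmetic concrete. Under that assumption, the sketch shows why such a topology favors short-range direct links: the number of links stays far below that of a single flat full mesh, while any two nodes remain at most n hops apart.

```python
from itertools import product


def full_mesh_neighbors(node, dims):
    """Return all nodes that differ from `node` in exactly one coordinate.

    Assumption (not stated in the abstract): this is what "full mesh
    along each dimension" means for an nD-FullMesh.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for value in range(size):
            if value != node[axis]:
                neighbors.append(node[:axis] + (value,) + node[axis + 1:])
    return neighbors


def count_links(dims):
    """Count undirected direct links in the assumed nD-FullMesh."""
    nodes = product(*(range(size) for size in dims))
    # Every link is seen once from each endpoint, hence the division by 2.
    return sum(len(full_mesh_neighbors(n, dims)) for n in nodes) // 2


def hop_distance(a, b):
    """Minimum hops between two nodes: one hop per differing coordinate."""
    return sum(x != y for x, y in zip(a, b))


dims = (8, 8, 4, 4)  # hypothetical 4D sizes, 1024 nodes in total
num_nodes = 8 * 8 * 4 * 4
flat_full_mesh_links = num_nodes * (num_nodes - 1) // 2

print("links in assumed 4D full mesh:", count_links(dims))
print("links in one flat full mesh:  ", flat_full_mesh_links)
print("worst-case hops:", hop_distance((0, 0, 0, 0), (7, 7, 3, 3)))
```

In this toy configuration each node has 7 + 7 + 3 + 3 = 20 direct neighbors, so the hierarchy trades a small, bounded hop count for a large reduction in links relative to connecting all 1024 nodes pairwise.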
