- Huawei wants UB-Mesh to unify fragmented interconnect standards across massive AI clusters
- The UB-Mesh design mixes a CLOS backbone with multidimensional rack-level meshes for scalability
- Traditional interconnects become too expensive for large-scale deployments
Huawei has revealed plans to open-source its UB-Mesh interconnect, a system aimed at unifying how processors, memory and networking equipment communicate in massive AI data centers.
The UB-Mesh design combines a CLOS-based backbone at the data hall level with multidimensional meshes inside each rack.
By combining these topologies, Huawei claims it can keep costs under control even as system sizes grow to tens of thousands of nodes. The company also hopes to solve the problem of scaling AI workloads, where latency and hardware failures create barriers.
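To make the cost argument concrete, here is a minimal sketch that counts direct links in a multidimensional full mesh, where two nodes are linked when their coordinates differ in exactly one dimension, and compares it with a single flat full mesh. The 8 x 8 rack layout, the figure of 128 racks, and all function names are illustrative assumptions, not details published by Huawei, and the backbone connecting the racks is not modeled.

```python
from math import prod


def nd_full_mesh(dims):
    """Return (nodes, links_per_node, total_links) for a mesh that is
    fully connected along each dimension separately.

    Assumption: two nodes share a link only if their coordinates differ
    in exactly one dimension. This is a generic model, not Huawei's spec.
    """
    nodes = prod(dims)
    links_per_node = sum(d - 1 for d in dims)
    total_links = nodes * links_per_node // 2  # each link joins two nodes
    return nodes, links_per_node, total_links


def flat_full_mesh_links(n):
    """Links needed if every node connected directly to every other node."""
    return n * (n - 1) // 2


def intra_rack_totals(racks, rack_dims):
    """Total nodes and in-rack links across identical racks.

    The backbone between racks is deliberately left out of this count.
    """
    nodes, _, links = nd_full_mesh(rack_dims)
    return racks * nodes, racks * links


if __name__ == "__main__":
    rack_dims = (8, 8)  # hypothetical 64-node rack
    nodes, degree, links = nd_full_mesh(rack_dims)
    print(f"one rack: {nodes} nodes, {degree} links per node, {links} links")
    print(f"flat full mesh of {nodes} nodes: {flat_full_mesh_links(nodes)} links")

    total_nodes, total_links = intra_rack_totals(128, rack_dims)
    print(f"128 racks: {total_nodes} nodes, {total_links} in-rack links")
    print(f"flat full mesh of {total_nodes} nodes: "
          f"{flat_full_mesh_links(total_nodes)} links")
```

Under these assumptions, the number of links per node inside a rack stays fixed no matter how many racks are added, while a flat full mesh grows quadratically with the node count. Real deployments also pay for backbone switches, cables and optics, which this toy count ignores.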
Replacing fragmented standards with a single framework
The move is pitched as a way to replace several overlapping standards with a single framework, which could reshape how large-scale computing infrastructure operates.
In simple terms, Huawei wants to replace today's mix of different connection rules with one universal system, so that everything links together more easily and at a lower cost.
“Next month, we have a conference where we are going to announce that the UB-Mesh protocol will be published and disclosed to anyone under a free license,” said Heng Liao, chief scientist of Huawei.
“This is a very new technology; we are seeing competing standardization efforts from different camps. […] Depending on how successful we are in deploying real systems and on demand from partners and customers, we can talk about turning it into some kind of standard.”
One of the central arguments behind UB-Mesh is that traditional interconnects become too expensive at large scale, ultimately costing more than the accelerators they are supposed to connect.
Huawei points to its own demonstrations, in which a deployment of 8,192 nodes served as proof that costs need not rise linearly.
The company considers this essential for future AI systems built from millions of processors, high-speed networking devices and massive storage arrays such as the largest SSD systems used in cloud storage operations.
UB-Mesh is part of a wider idea that Huawei calls the SuperNode. This refers to a data-center-scale cluster in which processors, GPUs, memory, SSDs and switches can all work as if they were part of a single machine.
Claims of bandwidth above one terabyte per second per device and sub-microsecond latency are presented as proof that the concept is not only possible but necessary for next-generation computing.
However, standards such as PCIe, NVLink, UALink and Ultra Ethernet already have the backing of multiple companies across the semiconductor and networking industries.
The question now is whether the industry will accept a new protocol backed by Huawei or continue to promote standards already supported by a wider range of companies.
Huawei's proposal, although ambitious, asks customers to adopt a protocol owned and controlled by a single vendor.
Even with open-source licensing, there are concerns about long-term interoperability, governance and geopolitical risk.
Huawei's technical potential looks impressive, but its plan depends on a degree of trust and industry-wide adoption that it has not yet secured.
Via Tom's Hardware