- Nvidia Rubin DGX SuperPOD delivers 28.8 exaflops with just 576 GPUs
- Each NVL72 system combines 36 Vera CPUs, 72 Rubin GPUs and 18 DPUs
- Aggregate NVLink throughput reaches 260 TB/s per DGX rack
At CES 2026, Nvidia unveiled its next-generation DGX SuperPOD powered by the Rubin Platform, a system designed to deliver extreme AI computing in dense, integrated racks.
According to the company, the SuperPOD integrates multiple Vera Rubin NVL72 or NVL8 systems into a single cohesive AI engine, supporting large-scale workloads with minimal infrastructure complexity.
With liquid-cooled modules, high-speed interconnects and unified memory, the system targets institutions seeking maximum AI throughput and reduced latency.
Rubin-based compute architecture
Each DGX Vera Rubin NVL72 system includes 36 Vera CPUs, 72 Rubin GPUs, and 18 BlueField-4 DPUs, with each Rubin GPU delivering 50 petaflops of FP4 performance.
Aggregate NVLink throughput reaches 260 TB/s per rack, allowing all memory and compute resources to operate as a single, cohesive AI engine.
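As a quick back-of-the-envelope check, the rack-level math follows from the GPU count. This is a minimal Python sketch; the 50-petaflop-per-GPU figure is inferred from the SuperPOD totals quoted later in this article rather than stated as a per-GPU spec here:

```python
# Cross-check of the NVL72 rack-level FP4 figure.
# Assumption: 50 PF FP4 is per Rubin GPU, as implied by the
# SuperPOD totals below (28.8 EF across 576 GPUs).
RUBIN_GPUS_PER_NVL72 = 72
FP4_PFLOPS_PER_GPU = 50  # assumed per-GPU FP4 throughput

rack_fp4_eflops = RUBIN_GPUS_PER_NVL72 * FP4_PFLOPS_PER_GPU / 1000
print(f"FP4 per NVL72 rack: {rack_fp4_eflops:.1f} exaflops")  # -> 3.6 exaflops
```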
The Rubin GPU integrates a third-generation Transformer Engine and hardware-accelerated compression, enabling inference and training workloads to be processed efficiently at scale.
Connectivity is provided by Spectrum-6 Ethernet switches, Quantum-X800 InfiniBand switches, and ConnectX-9 SuperNICs, which support high-speed, deterministic AI data transfer.
Nvidia’s SuperPOD design emphasizes end-to-end network performance, ensuring minimal congestion in large AI clusters.
Quantum-X800 InfiniBand delivers low latency and high throughput, while Spectrum-X Ethernet efficiently handles east-west AI traffic.
Each DGX rack integrates 600 TB of fast memory, NVMe storage, and dedicated AI context memory to support training and inference pipelines.
The Rubin platform also integrates advanced software orchestration through Nvidia Mission Control, streamlining cluster operations, automated recovery, and infrastructure management for large AI factories.
A DGX SuperPOD with 576 Rubin GPUs can achieve 28.8 exaflops of FP4 compute, while individual NVL8 systems deliver 5.5x the FP4 throughput of the previous Blackwell generation.
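That headline number is consistent with the per-GPU and per-rack figures above, as this short sketch shows (again assuming 50 PF of FP4 per Rubin GPU):

```python
# SuperPOD-scale arithmetic from the figures quoted in this article.
gpus_total = 576
gpus_per_rack = 72          # one NVL72 rack
fp4_pflops_per_gpu = 50     # assumed per-GPU figure, as above

racks = gpus_total // gpus_per_rack                     # -> 8 NVL72 racks
superpod_eflops = gpus_total * fp4_pflops_per_gpu / 1000
print(f"{racks} NVL72 racks -> {superpod_eflops:.1f} FP4 exaflops")  # 28.8
```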
By comparison, Huawei’s Atlas 950 SuperPod claims 16 exaflops of FP4 compute, meaning Nvidia extracts far more performance per accelerator and needs fewer units to reach extreme computing levels.
Rubin-based DGX clusters also use fewer nodes and cabinets than Huawei’s SuperCluster, which scales to thousands of NPUs and several petabytes of memory.
This performance density lets Nvidia compete directly with Huawei’s projected compute performance while limiting space, power, and interconnect costs.
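To put that density gap in rough per-accelerator terms: this article does not state the Atlas 950’s NPU count, so the sketch below treats Huawei’s commonly reported figure of 8,192 Ascend NPUs per SuperPod as an assumption rather than a confirmed spec:

```python
# Hypothetical per-accelerator comparison from the cited pod-level figures.
nvidia_eflops, nvidia_gpus = 28.8, 576
huawei_eflops, huawei_npus = 16.0, 8192  # NPU count is an assumption

nvidia_pf_per_gpu = nvidia_eflops * 1000 / nvidia_gpus   # ~50 PF per GPU
huawei_pf_per_npu = huawei_eflops * 1000 / huawei_npus   # ~2 PF per NPU
print(f"{nvidia_pf_per_gpu:.0f} PF per Rubin GPU vs "
      f"{huawei_pf_per_npu:.2f} PF per Ascend NPU (assumed count)")
```

Under that assumption, each Rubin GPU would deliver roughly 25x the FP4 throughput of a single Ascend NPU, which is the density argument behind Nvidia needing far fewer units per pod.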
The Rubin Platform unifies AI compute, networking, and software into a single stack.
Nvidia AI Enterprise software, NIM microservices, and mission-critical orchestration create a cohesive environment for long-term contextual reasoning, agentic AI, and multi-modal model deployment.
While Huawei scales primarily through hardware count, Nvidia emphasizes rack-level efficiency and tightly integrated software controls, which can reduce operational costs for industrial-scale AI workloads.
TechRadar will cover this year’s events extensively and will bring you all the big announcements as they happen. Visit our CES 2026 news page for the latest stories and our hands-on verdicts on everything from wireless TVs and foldable displays to new phones, laptops, smart home gadgets and the latest in AI. You can also ask us a question about the show in our live Q&A from CES 2026 and we will do our best to answer it.
And don’t forget to follow us on TikTok and WhatsApp for the latest news from the CES show!