Nvidia's new Rubin CPX GPU delivers 30 petaflops of compute and 128GB of memory for inference

Nvidia announces the Rubin CPX GPU with 128GB of memory, built for enterprise AI workloads
Vera Rubin NVL144 CPX rack delivers 8 exaflops of compute and 100TB of fast memory
Shipments are planned for the end of 2026, with Rubin Ultra and Feynman already on the roadmap

Nvidia has announced a brand new GPU built on the Rubin architecture and designed for long-context AI workloads.

Rubin CPX, as it's known, includes 128GB of GDDR7 memory, making it the company's first GPU to do so.

There had been rumors of a 128GB RTX gaming card, but this is definitely not that. This GPU is a compute engine intended for inference in fields such as software development, research and high-definition video generation. It won't be running Metal Gear Solid Delta: Snake Eater anytime soon.

Vera Rubin NVL144 CPX Rack

The GPU offers up to 30 petaflops of NVFP4 compute and incorporates hardware attention acceleration which, according to Nvidia, is three times faster than on the GB300 NVL72.

It also incorporates four NVENC and four NVDEC units to speed up video workflows.

As part of Nvidia’s wider push towards disaggregated inference, Rubin CPX is designed to handle the compute-heavy context phase, while Rubin GPUs and Vera CPUs handle the generation tasks.

By focusing Rubin CPX on context processing, Nvidia aims to improve throughput while reducing the cost of high-value deployments.

Nvidia Dynamo software will manage things behind the scenes, handling low-latency cache transfers and routing between the components.
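The split between a context phase and a generation phase can be pictured with a toy sketch. This is only an illustration of the general idea of disaggregated inference, not Nvidia Dynamo's API: the `Request`, `ContextWorker`, `GenerationWorker` and `serve` names are hypothetical, no model is run, and the "cache" is just a list of tokens standing in for the data a real system would transfer.

```python
from dataclasses import dataclass


@dataclass
class Request:
    prompt_tokens: list[str]
    max_new_tokens: int


@dataclass
class ContextWorker:
    """Hypothetical stand-in for a context-phase processor."""
    name: str

    def prefill(self, request: Request) -> dict:
        # Process the whole prompt once and return a cache for later reuse.
        # The token list is a placeholder for real cached model state.
        return {"cached_tokens": list(request.prompt_tokens), "built_by": self.name}


@dataclass
class GenerationWorker:
    """Hypothetical stand-in for a generation-phase processor."""
    name: str

    def decode(self, cache: dict, max_new_tokens: int) -> list[str]:
        # Produce tokens one at a time, extending the transferred cache.
        output = []
        for step in range(max_new_tokens):
            token = f"tok{step}"  # placeholder token; no model is involved
            cache["cached_tokens"].append(token)
            output.append(token)
        return output


def serve(request: Request, context_worker: ContextWorker,
          generation_worker: GenerationWorker) -> list[str]:
    # Phase 1: compute-heavy context processing on one pool of hardware.
    cache = context_worker.prefill(request)
    # Phase 2: the cache is handed to a different pool for generation.
    return generation_worker.decode(cache, request.max_new_tokens)


if __name__ == "__main__":
    req = Request(prompt_tokens=["a", "b", "c"], max_new_tokens=2)
    print(serve(req, ContextWorker("ctx-0"), GenerationWorker("gen-0")))
```

The point of the sketch is the handoff in `serve`: each phase can run on hardware suited to it, provided the cache moves between them quickly.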

The company's largest deployment model is the Vera Rubin NVL144 CPX rack. Each unit includes 144 Rubin CPX GPUs, 144 Rubin GPUs and 36 Vera CPUs.

Together, they deliver 8 exaflops of NVFP4 compute, 100TB of high-speed memory and 1.7PB/s of memory bandwidth.
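As a rough sanity check on those figures, the quoted per-GPU and per-rack numbers can be combined in a few lines. This is a back-of-the-envelope sketch that assumes every Rubin CPX GPU reaches the quoted "up to 30 petaflops" peak and that the 8 exaflops figure is a rounded rack total; the variable names are illustrative.

```python
# Figures quoted in the article (peak NVFP4 numbers, likely rounded).
PETAFLOPS_PER_CPX = 30
CPX_PER_RACK = 144
RACK_TOTAL_EXAFLOPS = 8
PETAFLOPS_PER_EXAFLOP = 1000

# Combined peak compute of the Rubin CPX GPUs in one rack.
cpx_exaflops = PETAFLOPS_PER_CPX * CPX_PER_RACK / PETAFLOPS_PER_EXAFLOP

# What is left of the quoted rack total for the rest of the system.
remaining_exaflops = RACK_TOTAL_EXAFLOPS - cpx_exaflops

# Fraction of the rack total contributed by the Rubin CPX GPUs.
cpx_share = cpx_exaflops / RACK_TOTAL_EXAFLOPS

print(f"Rubin CPX GPUs: {cpx_exaflops:.2f} exaflops ({cpx_share:.0%} of the rack total)")
print(f"Remainder of the rack: {remaining_exaflops:.2f} exaflops")
```

On these assumptions, the 144 Rubin CPX GPUs account for roughly 4.3 of the 8 exaflops, a little over half of the rack's quoted compute.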

Quantum-X800 InfiniBand or Spectrum-X Ethernet with ConnectX-9 SuperNICs provide connectivity.

Shipments of Rubin CPX and NVL144 CPX racks are currently penciled in for the end of 2026, following the recent tape-out at TSMC.

Nvidia's roadmap includes Rubin Ultra, now expected in 2027, and Feynman, scheduled for 2028.

These designs will extend the Rubin architecture with higher-density modules, HBM4E memory and faster networking.

Via Video

