NVIDIA announced on Tuesday, September 9, its new Rubin CPX GPU, a processor built specifically for long-context reasoning and video generation. The chip is designed to dramatically boost efficiency in AI workloads such as programming and advanced video creation, where models must process ultra-large context windows.
Jensen Huang, NVIDIA's founder and CEO, said, "Just as RTX transformed graphics and physics AI, Rubin CPX is the first CUDA GPU created for massive-context AI models capable of reasoning across millions of knowledge tokens at once."
Rubin itself is NVIDIA's next-generation high-performance chip scheduled for release next year, while Rubin CPX shipments are expected toward the end of 2026. The upcoming flagship AI server, officially named NVIDIA Vera Rubin NVL144 CPX, combines 36 Vera CPUs, 144 Rubin GPUs, and 144 Rubin CPX GPUs into a single platform.
Rubin CPX comes with 128GB of GDDR7 memory and delivers up to 30 PFLOPS of AI compute at NVFP4 precision. It's optimized for tasks requiring more than one million tokens in context as well as complex video generation.
The Vera Rubin NVL144 CPX system packs enormous power: each rack reaches 8 EFLOPS of AI performance (NVFP4 precision), backed by 100TB of fast memory and 1.7PB/s of memory bandwidth.
That is more than twice the performance of the existing Vera Rubin NVL144 platform and 7.5 times that of the GB300 NVL72 system built on Blackwell Ultra. It also delivers three times faster attention processing than the GB300 NVL72, making it one of the most powerful AI systems ever revealed.