NVIDIA Drops Four-Chiplet Rubin Ultra

2026-07-01 11:01:48 | Mr.Ming

According to industry reports, NVIDIA has revised the design of its next-generation Rubin Ultra AI accelerator, abandoning the originally planned configuration of four GPU compute chiplets in favor of a more manufacturable dual-chiplet architecture. The change is said to address production and packaging challenges associated with the more ambitious design. NVIDIA has not officially confirmed the reports.

Rubin Ultra was initially expected to be one of NVIDIA’s most aggressive data center GPU projects, targeting a 2027 launch. Compared with the standard Rubin accelerator, which is designed around two compute chiplets, the original four-chiplet Rubin Ultra architecture was expected to deliver roughly double the performance. However, the approach would have significantly increased design complexity, advanced packaging requirements, and thermal management challenges.

Industry sources indicate that integrating four large compute dies—each approaching reticle-size limits—into a single package using current advanced packaging technologies presents substantial manufacturing hurdles. In addition, efficiently cooling four high-performance compute chiplets alongside 16 stacks of HBM4E high-bandwidth memory would add considerable cost and engineering complexity. As a result, NVIDIA is reportedly shifting to a dual-chiplet design that is considered more practical for mass production.

If the reported redesign is accurate, the theoretical peak performance of the new Rubin Ultra could be lower than that of the original four-chiplet concept. This may narrow NVIDIA’s projected performance advantage over competing AI accelerators, including AMD’s forthcoming Instinct MI500 series. Nevertheless, NVIDIA could offset some of the performance reduction through architectural improvements, software optimization, and system-level enhancements.

The Rubin Ultra accelerator is still expected to adopt HBM4E memory technology, while the first-generation Rubin platform is anticipated to use HBM4. Beyond individual GPUs, NVIDIA’s broader strategy focuses on scaling AI infrastructure at the system level. Beginning with the Rubin generation, the company is expected to introduce its liquid-cooled Kyber rack-scale platform, increasing the number of GPUs within a single scaling domain to at least 144 packages. This approach aims to deliver significantly greater AI computing capacity for hyperscale and enterprise deployments.

The reported cancellation of the four-chiplet Rubin Ultra design could also have implications for the high-bandwidth memory market. The original configuration was expected to utilize 16 HBM4E memory stacks per accelerator, while the revised dual-chiplet version is reportedly designed with only eight HBM4E stacks. Such a reduction could affect future demand projections for advanced memory products.

From a cost perspective, the dual-chiplet Rubin Ultra is expected to be less expensive to manufacture than the previously planned four-chiplet variant. However, the impact on customer spending remains uncertain. NVIDIA increasingly markets complete rack-scale AI systems rather than standalone GPUs, meaning organizations may need to deploy more systems to achieve the same aggregate GPU count and computing performance. As a result, lower per-GPU production costs may not necessarily translate into lower overall infrastructure investment.
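The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-package figures (four versus two compute chiplets, 16 versus eight HBM4E stacks) and the 144-package rack figure are the reported numbers cited in this article, a rack is assumed to hold exactly 144 packages, and the target of 1,152 compute dies is an arbitrary example. The function and variable names are hypothetical.

```python
import math

# Reported figures from the article; none are confirmed by NVIDIA.
# Assumption: one Kyber rack holds exactly 144 packages
# (the article says "at least 144").
PACKAGES_PER_RACK = 144

# Hypothetical labels for the two reported Rubin Ultra configurations.
CONFIGS = {
    "four-chiplet": {"dies": 4, "hbm_stacks": 16},
    "dual-chiplet": {"dies": 2, "hbm_stacks": 8},
}


def deployment_for(target_dies, config):
    """Packages, racks, and HBM stacks needed to reach a compute-die count."""
    packages = math.ceil(target_dies / config["dies"])
    racks = math.ceil(packages / PACKAGES_PER_RACK)
    return {
        "packages": packages,
        "racks": racks,
        "hbm_stacks": packages * config["hbm_stacks"],
    }


TARGET_DIES = 1152  # arbitrary example target, not a figure from the article

for name, cfg in CONFIGS.items():
    print(name, deployment_for(TARGET_DIES, cfg))
```

Under these assumptions, the dual-chiplet design needs twice as many packages and racks to reach the same compute-die count, while the total number of HBM4E stacks comes out the same, because both configurations carry four stacks per compute die. Actual deployments would depend on pricing, per-die performance, and rack configurations that have not been disclosed.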

