Google Unveils 7th-Gen TPU with 4,614 TFLOPS

2025-04-10 13:48:03 | Mr.Ming

On April 9, 2025, at the "Google Cloud Next 25" event held in the United States, Google officially launched "Ironwood," its seventh-generation Tensor Processing Unit (TPU), designed specifically for AI workloads. Ironwood is the first Google TPU built primarily for inference, powering reasoning ("thinking") models as well as conventional inference serving, with a peak compute of 4,614 TFLOPS per chip.

Ironwood introduces significant advances in processing power and architecture. It is the first Google TPU whose tensor cores and matrix math units support FP8 computation; the previous generation supported only INT8 for inference and BF16 for training. Ironwood also carries the third-generation SparseCore accelerator, which debuted in TPU v5p and was enhanced in last year's Trillium chip. SparseCore is designed to accelerate recommendation models and has now been extended to financial and scientific workloads, though Google has not disclosed the specific algorithms involved.
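The article does not say which FP8 variant Ironwood implements; the two formats commonly used on AI accelerators are E4M3 and E5M2 (as standardized in the OCP 8-bit floating point specification). As an illustrative sketch, the calculation below shows why an 8-bit float offers far more dynamic range than INT8 at the same width:

```python
def max_finite(exp_bits: int, man_bits: int, ieee_top_exponent: bool) -> float:
    """Largest finite value of a simple sign/exponent/mantissa float format.

    ieee_top_exponent=True  -> the all-ones exponent code is reserved for
                               inf/NaN (E5M2 behaves this way).
    ieee_top_exponent=False -> the all-ones exponent code is reused for finite
                               values, with only the all-ones mantissa pattern
                               reserved for NaN (the usual E4M3 convention).
    """
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_top_exponent:
        exponent = (2 ** exp_bits - 2) - bias   # top exponent code reserved
        fraction = 2 - 2 ** -man_bits           # mantissa pattern 1.11...1
    else:
        exponent = (2 ** exp_bits - 1) - bias   # top exponent code usable
        fraction = 2 - 2 ** -(man_bits - 1)     # mantissa pattern 1.11...0
    return fraction * 2 ** exponent

print(max_finite(4, 3, ieee_top_exponent=False))  # E4M3 -> 448.0
print(max_finite(5, 2, ieee_top_exponent=True))   # E5M2 -> 57344.0
print(2 ** 7 - 1)                                 # INT8 max -> 127
```

The extra range is what makes FP8 usable for activations and gradients where INT8 would clip, at the cost of mantissa precision.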

The Ironwood TPU also boasts a massive increase in High Bandwidth Memory (HBM) capacity, reaching 192 GB per chip, six times that of the previous-generation Trillium chips. The added capacity lets larger models and datasets stay resident on-chip, reducing frequent data transfers and boosting overall performance. HBM bandwidth is also significantly upgraded: each chip delivers 7.2 TB/s of memory bandwidth, 4.5 times that of Trillium, ensuring fast data retrieval and storage.

In terms of inter-chip connectivity, Ironwood features improved Inter-Chip Interconnect (ICI) bandwidth, with bidirectional transfer rates increased to 1.2 TB/s, 1.5 times that of Trillium, enabling faster communication between chips and more efficient large-scale distributed training and inference.
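Taken together, the memory and interconnect multipliers quoted above imply the following Trillium baseline figures (a back-of-the-envelope check, assuming the stated ratios are exact):

```python
# Ironwood per-chip figures quoted in the article.
ironwood_hbm_gb = 192     # HBM capacity, GB
ironwood_hbm_tbs = 7.2    # HBM bandwidth, TB/s
ironwood_ici_tbs = 1.2    # bidirectional ICI bandwidth, TB/s

# Generation-over-generation multipliers versus Trillium.
hbm_capacity_x = 6.0
hbm_bandwidth_x = 4.5
ici_bandwidth_x = 1.5

# Implied Trillium baselines.
trillium_hbm_gb = ironwood_hbm_gb / hbm_capacity_x      # 32 GB
trillium_hbm_tbs = ironwood_hbm_tbs / hbm_bandwidth_x   # 1.6 TB/s
trillium_ici_tbs = ironwood_ici_tbs / ici_bandwidth_x   # 0.8 TB/s

print(trillium_hbm_gb, trillium_hbm_tbs, trillium_ici_tbs)
```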

Beyond raw performance, Google has focused on energy efficiency with Ironwood, which delivers twice the performance per watt of the sixth-generation Trillium TPU. With advanced liquid cooling and chip-level design optimizations, Ironwood sustains nearly twice the performance of comparable air-cooled systems, even under sustained heavy AI workloads.

Ironwood is available in two configurations to match different AI workload needs: a 256-chip pod aimed at inference and a 9,216-chip pod aimed at training. At full scale, a 9,216-chip pod can deliver up to 42.5 exaflops of compute, more than 24 times that of the world's largest supercomputer, El Capitan, which delivers roughly 1.7 exaflops.
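The pod-level figure follows directly from the per-chip number, as the quick arithmetic check below shows. (Note that El Capitan's 1.7 exaflops is a double-precision supercomputing figure, so the two numbers are not measured at the same numeric precision.)

```python
chips_per_pod = 9216
tflops_per_chip = 4614  # peak TFLOPS per Ironwood chip

# 1 exaFLOPS = 1e6 TFLOPS.
pod_exaflops = chips_per_pod * tflops_per_chip / 1e6
print(round(pod_exaflops, 1))   # -> 42.5 exaflops

el_capitan_exaflops = 1.7
print(round(pod_exaflops / el_capitan_exaflops, 1))  # -> 25.0
```

The ratio works out to roughly 25x, consistent with the article's "more than 24 times" claim.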

Designed to handle the most demanding AI workloads, Ironwood is particularly suitable for training and inference of large, dense models such as large language models (LLMs) or mixture-of-experts (MoE) models. It also features an enhanced SparseCore accelerator to support large-scale embedding tasks for advanced ranking and recommendation systems.

Google DeepMind's Pathways framework further enhances the power of Ironwood, enabling efficient distributed computing across multiple TPUs. Available on Google Cloud, Pathways allows clients to surpass the limits of a single Ironwood pod by aggregating hundreds of thousands of chips, driving rapid advancements in AI.

With the AI chip market highly competitive, Google's new Ironwood TPU joins offerings from tech giants like NVIDIA, Amazon, and Microsoft. Amazon has its AWS Trainium, Inferentia chips, and Graviton processors, while Microsoft offers its Maia 100 and Cobalt 100 chips. The launch of Ironwood further strengthens Google Cloud’s position in the rapidly growing AI landscape.
