May 15, 2024 - Google I/O 2024, the company's annual developer conference, kicked off in the early hours of Beijing time with the unveiling of the sixth-generation TPU (Tensor Processing Unit) chip. At the same event, Google announced that Android would also enter the Gemini era.
Alphabet, Google's parent company, added Trillium to its lineup of artificial intelligence (AI) data center chips, claiming a speed increase of nearly five times over the previous generation.
The introduction of Trillium reflects Alphabet's sustained effort to build custom chips for AI data centers, one of the few viable alternatives to NVIDIA's top-tier processors. The tight integration between Google's Tensor Processing Units (TPUs) and its software stack has allowed Google to capture a significant share of the market.
Currently, NVIDIA holds approximately 80% of the market share in AI data center chips, with the remaining 20% primarily consisting of various versions of Google's TPU. Google itself does not sell chips but rather rents access through its cloud computing platform.
Google stated that the sixth-generation Trillium chip delivers 4.7 times the compute performance of TPU v5e and is designed to support workloads ranging from large-scale text generation to other media. The Trillium processor is also 67% more energy efficient than TPU v5e.
Moreover, Trillium is equipped with the third-generation SparseCore, a dedicated accelerator for processing large embeddings, which is used in advanced ranking and recommendation workloads.
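To give a rough sense of the embedding-lookup pattern that ranking and recommendation models rely on (the kind of workload accelerators such as SparseCore target), here is a minimal JAX sketch. The table size, dimensions, and function names are illustrative assumptions only, not any Trillium or SparseCore API.

```python
import jax
import jax.numpy as jnp

# Illustrative sizes (assumed); real recommendation tables are far larger.
VOCAB_SIZE = 100_000   # number of items in the catalogue
EMBED_DIM = 64         # embedding width

key = jax.random.PRNGKey(0)
embedding_table = jax.random.normal(key, (VOCAB_SIZE, EMBED_DIM))

def embed_and_score(table, item_ids, user_vector):
    # Gather the embeddings for a batch of sparse item IDs.
    item_vecs = table[item_ids]            # shape: (batch, EMBED_DIM)
    # Score each item against a dense user representation (dot product).
    return item_vecs @ user_vector         # shape: (batch,)

item_ids = jnp.array([12, 9876, 54321])   # sparse IDs for a small batch
user_vector = jax.random.normal(jax.random.PRNGKey(1), (EMBED_DIM,))
scores = jax.jit(embed_and_score)(embedding_table, item_ids, user_vector)
print(scores.shape)  # (3,)
```

The gather step dominates such workloads because the table is huge and the accesses are sparse and irregular, which is why a dedicated accelerator for embeddings matters for ranking and recommendation.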
Trillium can scale to up to 256 TPUs within a single high-bandwidth, low-latency pod. Beyond pod-level scaling, Trillium can reach hundreds of pods using Multislice technology and Titanium Intelligence Processing Units (IPUs).
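For readers curious what spreading a computation across many chips looks like in practice, the following is a minimal JAX sketch that shards a batch over whatever accelerator devices are visible. The device counts, array sizes, and names are illustrative assumptions and are not tied to Trillium pods or Multislice itself.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Build a 1-D device mesh over whatever devices are visible (TPU chips, GPUs, or CPU).
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("data",))

# Shard a batch along its leading axis across the mesh's "data" axis.
batch = jnp.ones((devices.size * 128, 512))
batch = jax.device_put(batch, NamedSharding(mesh, PartitionSpec("data", None)))

@jax.jit
def forward(x):
    # Toy layer; under jit the computation follows the input's sharding.
    return jnp.tanh(x @ jnp.ones((512, 256)))

y = forward(batch)
print(y.shape, y.sharding)
```

The same program runs unchanged whether one chip or many are available; the sharding annotation, not the model code, determines how work is split, which is the general idea behind scaling training jobs across pods.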
Trillium will help Google train next-generation foundation models faster and serve them with lower latency and lower cost.
Google announced that the new chip will be available to cloud customers by the end of 2024.