Google's Advanced Supercomputer Empowers AI Development with Custom TPU Chips

2023-04-11 13:17:52 · Mr.Ming

According to recent reports, Google, a subsidiary of Alphabet, has revealed new details about the supercomputers it uses to train artificial-intelligence models. The company claims these systems are faster and more energy-efficient than comparable systems from Nvidia.

Google designs its own custom TPU (Tensor Processing Unit) chips, which handle more than 90% of the company's AI training work. AI training feeds data through a model so that it learns to perform tasks such as human-like text conversation and image generation.

Google's TPU is now in its fourth generation, and the company has published a paper describing how it uses custom optical switches to link more than 4,000 of the chips, spread across multiple independent machines, into a single supercomputer.

Improving the efficiency of these connections has become a key point of competition among tech companies building AI supercomputers, because the large language models behind Google's Bard and OpenAI's ChatGPT are far too large to run on a single chip. Each model must be distributed across thousands of chips that work together for weeks or longer to complete training.
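The distribution described above can be illustrated with a toy data-parallel training step: the batch is split into shards, each simulated "chip" computes a gradient on its own shard, the gradients are averaged (an all-reduce), and the shared weights are updated in sync. This is a minimal pure-Python sketch of the general technique, not Google's actual training pipeline:

```python
# Toy data-parallel training step for a 1-D model y = w * x.
# Each simulated "chip" computes a gradient on its shard of the batch;
# the gradients are averaged (all-reduce) and applied to the shared weight.
# Real TPU pods do this across thousands of chips over hardware interconnects.

def local_gradient(w, shard):
    # Gradient of mean squared error on this shard only.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train_step(w, batch, num_chips, lr=0.01):
    size = len(batch) // num_chips
    shards = [batch[i * size:(i + 1) * size] for i in range(num_chips)]
    grads = [local_gradient(w, s) for s in shards]  # per-chip compute
    avg = sum(grads) / num_chips                    # all-reduce (average)
    return w - lr * avg                             # synchronized update

# Data drawn from y = 3x; training should move w toward 3.
data = [(x, 3 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = train_step(w, data, num_chips=4)
print(round(w, 2))  # converges toward 3.0
```

Because every shard is the same size, averaging the per-chip gradients reproduces the gradient over the whole batch, so the distributed step matches a single-chip step exactly; the engineering challenge the article describes lies in making that averaging fast across thousands of physical chips.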

PaLM, the largest language model Google has publicly disclosed to date, was trained by splitting it across two of these 4,000-chip supercomputers over more than 50 days.

Google says its supercomputer can easily reconfigure the connections between chips while running, helping it route around failures and tune performance.

"Switching circuits can easily bypass faulty components," wrote Google fellow Norm Jouppi and distinguished engineer David Patterson in a blog post. "This flexibility can even allow us to change the supercomputer's interconnect topology to improve machine learning model performance."
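The "bypass faulty components" idea can be sketched as rerouting in an interconnect graph: when a chip fails, traffic is re-planned along a path that avoids it. The following is a hypothetical pure-Python illustration of that routing concept; the real system reconfigures optical circuit switches in hardware rather than running a software search:

```python
from collections import deque

def shortest_path(links, src, dst, faulty=frozenset()):
    """Breadth-first search over an interconnect graph, skipping faulty chips."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in links[node]:
            if nxt not in seen and nxt not in faulty:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # destination unreachable

# A tiny 2x3 mesh of "chips": row 0-1-2, row 3-4-5, vertical links 0-3, 1-4, 2-5.
links = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
         3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}

print(shortest_path(links, 0, 5))              # [0, 1, 2, 5]
print(shortest_path(links, 0, 5, faulty={1}))  # reroutes: [0, 3, 4, 5]
```

When chip 1 fails, the route from 0 to 5 automatically shifts to the lower row, which is the software analogue of the switching flexibility Jouppi and Patterson describe.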

Although Google has only now published the technical details, the system has been in use in the company's Oklahoma data center since 2020. Google said the startup Midjourney used the system to train its model, a tool that generates entirely new images from simple text prompts.

In the paper, Google said its chips are 1.7 times faster than Nvidia's A100 chips and 1.9 times more energy-efficient. An Nvidia spokesperson declined to comment on this comparison.

Google said it did not compare its fourth-generation TPU to Nvidia's existing H100 flagship chip because the latter was released later and used newer technology.

Google hinted that it may be developing a new generation of TPU that can compete with Nvidia's H100, but did not disclose any details. Jouppi said in an interview that Google has a "healthy pipeline of future chips."
