Over the past two weeks, so many people around the world have rushed to chat with a chatbot called ChatGPT that its servers crashed five times in two days...
Launched by the artificial intelligence lab OpenAI at the end of last November, the program passed 1 million registered users within 5 days of release and had exceeded 100 million by the end of this January! For perspective: TikTok took 9 months to hit the "modest goal" of 100 million users, and Twitter took 60 months! With follower growth this fast, it would almost be a waste not to cash in...
So what exactly is this ChatGPT that has become an overnight global sensation and sent investors in Chinese AI concept stocks into raptures? What does its breakout signal? And what does it have to do with the chips behind it?
After reading this article, you will know the following:
1. What is ChatGPT?
2. The chip behind ChatGPT
What the Hell Is ChatGPT?
Officially, ChatGPT is an artificial intelligence chatbot developed by OpenAI. Take the name apart: "Chat" refers to chatting, the form the product takes; "GPT" stands for Generative Pre-trained Transformer, the pre-trained model that powers it. Built on the Transformer architecture and trained on a large amount of text data, ChatGPT can handle tasks such as language translation, question answering, and dialogue.
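To make "generative" concrete, here is a minimal sketch of autoregressive generation, the loop at the heart of every GPT-style model: predict the next token, append it, repeat. The toy bigram table below is an illustrative stand-in, not OpenAI's model; a real GPT would use a Transformer to score every token in its vocabulary at each step.

# Toy "language model": a fixed bigram lookup standing in for a real
# Transformer, which would instead score the whole vocabulary each step.
bigrams = {"<s>": "hello", "hello": "there", "there": ",", ",": "how",
           "how": "can", "can": "i", "i": "help", "help": "?"}

def generate(start="<s>", max_tokens=10):
    tokens = [start]
    for _ in range(max_tokens):
        nxt = bigrams.get(tokens[-1])  # predict the next token...
        if nxt is None:
            break
        tokens.append(nxt)             # ...append it, and repeat
    return " ".join(tokens[1:])

print(generate())  # hello there , how can i help ?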
In terms of the experience, it is like a high-end version of Xiaomi's "Xiao AI" voice assistant, one that has ideas of its own and can learn. Over a continuing conversation it remembers the information you provide, steadily enriching its corpus so it can give better answers.
That's right: sitting across from you is a genuine tool "person".
AI has been around for a long time, so why is this one suddenly such a hit?
The core structure of GPT is the Transformer. To really understand ChatGPT, you have to start there.
The Transformer is a deep learning model built around a self-attention mechanism. Broadly speaking, neural networks are a very effective class of models for analyzing complex data such as images, video, audio, and text, but different architectures suit different kinds of data.
Before the Transformer, the standard way to handle input sequences with deep learning was a model called the recurrent neural network (RNN). An RNN has to consume its input strictly in order, producing one output per step, so it is hard to parallelize; in other words, you cannot speed up training simply by adding more GPUs.
A Transformer, by contrast, takes in the whole input at once. It does not have to step through the sequence element by element, so it parallelizes extremely well, and given enough hardware we can train some truly enormous models. GPT is built on this architecture.
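To make the self-attention idea concrete, below is a minimal sketch of scaled dot-product self-attention in Python with NumPy. The names and sizes are illustrative assumptions, not GPT's actual dimensions, and a real model would first apply learned projection matrices. The point to notice is that every position attends to every other position in one batch of matrix multiplications, with no sequential loop, which is exactly why the computation maps so well onto GPUs.

import numpy as np

def self_attention(x):
    # x: (seq_len, d) token embeddings. For simplicity the queries, keys,
    # and values are all x itself (a real model projects x first).
    d = x.shape[-1]
    # One (seq_len x seq_len) matrix multiply compares every position with
    # every other position simultaneously -- no step-by-step recurrence.
    scores = x @ x.T / np.sqrt(d)
    # Row-wise softmax turns the scores into attention weights.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of all input vectors.
    return w @ x

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))     # toy sequence: 4 tokens, 8-dim embeddings
print(self_attention(tokens).shape)  # (4, 8)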
The first-generation GPT-1 was released in 2018. It was the first to use the Transformer architecture as its feature extractor, fixing the defects and efficiency problems of the traditional RNN. But the model was small, with only a little over 100 million parameters, pitifully few next to today's hundreds of billions, so its performance was mediocre and it was only applied to tasks such as question answering, semantic similarity assessment, entailment judgment, and text classification. After all, the more data a machine learning model sees, the more it has to learn from, and the better its odds of producing accurate, smart results.
GPT-2 arrived the following year. The underlying architecture was unchanged, but the parameter count grew dramatically. By GPT-3, the model had reached 175 billion parameters and could already carry out genuinely creative AI tasks: answering questions, writing essays, summarizing text, translating between languages, and generating computer code.
So why did the earlier generations come and go quietly, while ChatGPT suddenly exploded?
The Chip Behind ChatGPT
Content creation in general has gone through three stages: professionally generated content (PGC), user-generated content (UGC), and AI-generated content (AIGC). Early professional video portals ran on the PGC model; today, mainstream social platforms such as Douyin, Xiaohongshu, and Weibo make UGC their core mode of communication.
At present PGC and UGC still dominate, with AIGC as a supplement, but AIGC is widely seen as the next mode of content creation after the other two.
ChatGPT is only one slice of AIGC: AI writing, AI music composition, and the recently popular AI painting are all branches of it. As the technology has matured, the "artificial stupidity" of the past has transformed into today's "digital humans", finding roles in entertainment, finance, retail, and beyond. As early as the 2016 Rio Olympics, writing robots took part in event coverage. Baidu's digital human lineup alone includes AI sign-language anchors, virtual idols, and virtual editors, and its AI technology has been used by CCTV, participating in coverage of last year's Two Sessions.
ChatGPT itself is a super-capable dialogue AI built on a large-scale language model. But whether we are discussing AIGC the concept or ChatGPT the hit product, what we are really discussing is the AI industry chain behind them.
Artificial intelligence rests on three pillars: data, algorithms, and computing power. The three complement one another, and none can be done without.
As mentioned above, ChatGPT is an upgrade built on OpenAI's third-generation large model, GPT-3. From the first generation to the third, the parameter count has climbed to 175 billion, and in theory it will keep growing as computing power grows; where the ceiling lies, nobody knows.
Computing power, in turn, comes from chips, and the popularity of ChatGPT is bound to push the AI chip industry forward.
The AIGC industry chain can be divided into a compute hardware layer, a cloud computing platform layer, a model layer, and an application layer. ChatGPT competes at the model and application layers, but the compute hardware layer is unquestionably the backbone propping it up.
"AI chip" broadly means any chip that accelerates AI applications, and the main categories are the GPU, the FPGA, and the ASIC. Because a CPU offers limited compute and struggles with parallel workloads, it is generally paired with one of these accelerators. Specifically: the GPU began life as an image-processing chip, but it is highly general-purpose and built for large-scale parallel computation, with raw compute far beyond a CPU's, which makes it a natural fit for data-intensive AI workloads. The FPGA's strengths are a short development cycle and high flexibility, and it is widely used in data centers and military applications. The ASIC's strengths are small size, low power consumption, and high performance; it is common in consumer electronics and also well suited to AI computing.
The computing cluster behind ChatGPT runs on AI chips from Nvidia. OpenAI has said that ChatGPT is a super AI completed in cooperation with Nvidia and Microsoft: Microsoft built a supercomputing cluster in its own Azure cloud and handed it to OpenAI. The supercomputer is reported to have 285,000 CPU (central processing unit) cores and more than 10,000 AI chips.
Beyond compute chips, AI dialogue programs need large-capacity, high-speed storage while they run, so demand for high-performance memory chips is expected to climb as well. Samsung Electronics has said that demand for high-bandwidth memory (HBM), which feeds data to GPUs and AI accelerators, will expand, and that over the long run, as AI chatbot services grow, so will demand for high-performance HBM of 128 GB or more and for high-capacity server DRAM.
Cutting cost and power consumption has become the guiding direction for purpose-built AI chips. By one account, a top-end Nvidia GPU costs 80,000 yuan and a GPU server typically runs over 400,000 yuan, while a single training run of a ChatGPT-scale model costs upwards of 12 million US dollars. As OpenAI CEO Sam Altman once put it in a tweet: "Every time a user chats with ChatGPT, it costs a few cents."
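Those few cents add up quickly at ChatGPT's scale. Here is a rough back-of-envelope sketch in Python, where every input except Altman's "few cents" figure is an illustrative assumption rather than a reported number:

# All values below are illustrative assumptions, except the per-chat cost,
# which follows Sam Altman's "a few cents" remark (exact figure unknown).
cost_per_chat_usd = 0.03          # assumed: "a few cents" per chat
daily_active_users = 10_000_000   # assumed user count, for illustration only
chats_per_user = 10               # assumed chats per user per day

daily_cost = cost_per_chat_usd * daily_active_users * chats_per_user
print(f"Illustrative daily serving cost: ${daily_cost:,.0f}")
# -> Illustrative daily serving cost: $3,000,000

Even under these modest assumptions, the bill lands in the millions of dollars a day, which is the context for the Xiaoice estimate quoted later in this article.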
With the development of SoC technology, more and more companies have launched self-developed AI-specific chips: Google's TPU tensor processor, Nvidia's BlueField data processors (DPUs), Baidu's Kunlun series, Huawei's Ascend series, and Hanguang 800 from Alibaba's Pingtouge, among others. According to data from EO Intelligence, as large computing centers multiply and terminal applications gradually land, China's demand for AI chips keeps rising. In 2021, the easing of the epidemic and the market's recovery produced a large jump; new chip types such as brain-inspired chips are expected to enter mass production in 2023 at the earliest, so further jumps may come in 2024 and 2025, with the market expected to reach 174 billion yuan in 2025.
Viewed by computing function, AI models must first be trained, tuned, and tested in the cloud, where data volumes and task counts run into the tens of thousands, so cloud training is currently the mainstream demand in the AI chip market. Later, the trained model is pushed to the device side, where it combines with real-time data to run inference and deliver AI features. As inference demand gradually overtakes training demand, the inference chip market will rise; by 2025, cloud-side and device-side inference are expected to become the market's main growth engines, lifting the year-on-year growth rate of an otherwise decelerating AI chip market.
The AI startup frenzy of a few years ago, and the capital that poured madly into it, are still fresh in memory. The "AI experts" who appeared out of thin air and the "AI programmers" fought over at sky-high salaries eventually fell silent once the tide went out. ChatGPT's breakout success seems to rekindle that original "passion". But how long before ChatGPT actually lands in production? What will landing bring people? How much will landing cost, and is that cost affordable?
On the question of cost, Xiaoice CEO Li Di has said that if Xiaoice adopted the ChatGPT approach, its daily cost would run as high as 300 million yuan, and its annual cost would top 100 billion yuan.
Never mind the round-the-clock electricity bills: the steep chip prices and operating costs alone may be enough to scare off many "ambitious" companies. So how far can ChatGPT go, and how much can it propel the high-end chip industry? Nothing is settled yet.