Nvidia GPUs: the alarm bell has rung.

Shortly after news broke that France would open an antitrust investigation into Nvidia, more bad news emerged.

EU Competition Commissioner Margrethe Vestager warned, in comments cited by Bloomberg, that there is a "huge bottleneck" in the supply of Nvidia's AI chips, though the regulator is still considering how to address the issue.

"We've been asking them questions, but this is just the beginning," she told Bloomberg during her trip to Singapore. So far, it "does not meet the conditions for regulatory action."

Since Nvidia became the biggest beneficiary of the AI spending boom, regulators have been keeping an eye on it. Its graphics processing units (GPUs) are favored by data center operators for their ability to handle the vast amount of information needed to develop AI models.

The chips have become one of the hottest commodities in the tech industry, with cloud computing providers competing to acquire them. Demand for Nvidia's H100 processors is estimated to have helped the company capture more than 80% of the market, well ahead of competitors Intel and Advanced Micro Devices (AMD).

Despite the supply crunch, Vestager said that a secondary market for AI chips may help stimulate innovation and fair competition.

However, she indicated that dominant companies may face certain behavioral restrictions in the future.

"If you have this dominant position in the market, there are some things you can't do, but small companies can," she said. "But apart from that, as long as you do your business and respect this, you are fine."

The "big puzzle" of $600 billion

Despite the tech giants' enormous investment in AI infrastructure, the revenue growth AI was expected to deliver has yet to materialize, pointing to a huge gap in end-user value across the ecosystem. Sequoia Capital analyst David Cahn estimates that AI companies would need to earn about $600 billion a year to cover the cost of their AI infrastructure, such as data centers.

Last year, Nvidia's data center hardware revenue reached $47.5 billion (most of it from compute GPUs used for AI and HPC applications). Companies such as AWS, Google, Meta, and Microsoft poured enormous sums into AI infrastructure in 2023 to support applications like OpenAI's ChatGPT. But can they earn that investment back? Cahn believes this may mean we are witnessing a financial bubble in the making.

In Cahn's telling, the $600 billion figure can be derived with some simple arithmetic.

All you have to do is take Nvidia's run-rate revenue forecast and multiply it by 2 to reflect the total cost of AI data centers (GPUs account for roughly half of total cost of ownership; the other half covers energy, buildings, backup generators, and so on). Then multiply by 2 again to reflect a 50% gross margin for the end users of that GPU compute (for example, the startups and companies buying AI compute from Azure, AWS, or GCP, which also need to make money).
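As a rough illustration of that back-of-the-envelope math, the sketch below works through the two doublings. The run-rate figure is an assumed input (roughly what the $600 billion result implies), not a number stated in Cahn's post:

```python
# Minimal sketch of Cahn's back-of-the-envelope math.
# The run-rate figure below is an assumption; only the two 2x multipliers
# come from the argument described above.

nvidia_run_rate_revenue = 150e9  # assumed annualized data-center revenue forecast, USD

data_center_total_cost = nvidia_run_rate_revenue * 2   # GPUs ~ half of total cost of ownership
end_user_revenue_needed = data_center_total_cost * 2   # end users need ~50% gross margin

print(f"Implied AI revenue needed: ${end_user_revenue_needed / 1e9:.0f}B per year")
# -> Implied AI revenue needed: $600B per year
```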

So what has changed since September 2023, when he framed AI as a $200 billion problem?

1. The supply shortage has subsided: The end of 2023 was the peak of GPU supply shortages. Startups were calling venture capital firms, calling anyone willing to talk to them, seeking help to obtain GPUs. Today, this concern has almost completely disappeared. For most of the people I have talked to, it is relatively easy to get GPUs with reasonable delivery times now.

2. GPU inventory continues to grow: Nvidia reported that about half of its fourth-quarter data center revenue came from the large cloud providers, and Microsoft alone may have accounted for about 22% of Nvidia's fourth-quarter revenue. Hyperscaler capital expenditure is reaching record levels. These investments were a central theme of big tech earnings in the first quarter of 2024, with CEOs effectively telling the market: "Whether you like it or not, we will invest in GPUs." Hoarding hardware is not a new phenomenon, and once inventories are large enough that demand slows, it becomes the catalyst for a reset.

3. OpenAI still accounts for the largest share of AI revenue: The Information recently reported that OpenAI's revenue is now $3.4 billion, up from $1.6 billion at the end of 2023. Although we have seen a handful of startups with revenue below $100 million, the gap between OpenAI and everyone else remains very large. Beyond ChatGPT, how many AI products are consumers really using today? Think about how much value you get for $15.49 a month from Netflix or $11.99 a month from Spotify. In the long run, AI companies will need to deliver comparable value if consumers are to keep paying.

4. The $125 billion gap has become a $500 billion gap: In the last analysis, I generously assumed that Google, Microsoft, Apple, and Meta would each generate $10 billion a year in new AI-related revenue, and that Oracle, ByteDance, Alibaba, Tencent, X, and Tesla would each generate $5 billion. Even if those assumptions still hold and we add a few more companies to the list, the $125 billion gap has now grown to a $500 billion gap.
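A rough reconstruction of that $500 billion gap from the assumptions listed above is sketched below; the allowance for "a few more companies" is an illustrative placeholder, not a figure from the original:

```python
# Rough reconstruction of the ~$500B gap from the assumptions above.
# The "other companies" allowance is an illustrative placeholder.

revenue_needed = 600e9           # from the $600B calculation above

big_four = 4 * 10e9              # Google, Microsoft, Apple, Meta at $10B each
next_six = 6 * 5e9               # Oracle, ByteDance, Alibaba, Tencent, X, Tesla at $5B each
other_companies = 30e9           # assumed allowance for "a few more companies"

assumed_ai_revenue = big_four + next_six + other_companies
gap = revenue_needed - assumed_ai_revenue

print(f"Assumed AI revenue: ${assumed_ai_revenue / 1e9:.0f}B, gap: ${gap / 1e9:.0f}B")
# -> Assumed AI revenue: $100B, gap: $500B
```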

This is not the end - the B100 is coming: Earlier this year, Nvidia announced the B100 chip, which delivers 2.5 times the performance for only 25% more cost. I expect this to lead to a final surge in demand for NVDA chips. Compared with the H100, the B100 is a significant cost-performance improvement, and another supply shortage is very likely as everyone tries to get their hands on B100s later this year.

When Cahn previously raised questions about GPU spending, one of the main counterarguments he received was that "GPU capital expenditure is like building the railways": eventually the trains will come, and so will the destinations - new agricultural exports, amusement parks, shopping centers, and so on.

David Cahn stated that he actually agrees with this point, but he believes that this argument overlooks several aspects:

1. Lack of pricing power: With physical infrastructure, the thing you build has some inherent value. If you own the tracks between San Francisco and Los Angeles, you likely have some monopoly pricing power, because only so many tracks can be laid between point A and point B. With GPU data centers there is far less pricing power: GPU compute is increasingly a commodity, metered by the hour. Unlike the CPU cloud, which became an oligopoly, new entrants building dedicated AI clouds keep flooding the market. Without a monopoly or oligopoly, a business with high fixed costs and low marginal costs almost always sees prices competed down to marginal cost (see airlines).

2. Investment waste: Even in the railway era, and in many other new technology industries, speculative investment frenzies led to enormous amounts of capital being burned. "Engines That Move Markets" is a textbook on technology investing, and its key takeaway (one that indeed focuses on the railways) is that many people lose a lot of money in speculative technology waves. Picking winners is hard, but picking losers (in the railway case, canals) is much easier.

3. Depreciation: We know from the history of technology that semiconductors keep getting better. Nvidia will keep producing better next-generation chips such as the B100, which will accelerate the depreciation of the previous generation. Because the market underestimates how quickly the B100 and subsequent chips will improve, it overestimates how much an H100 bought today will be worth in three to four years. Physical infrastructure has no such parallel: it does not follow a "Moore's Law"-type curve in which price-performance keeps improving.

4. Winners and losers: I believe we need to look carefully at winners and losers - there are always winners during periods of excess infrastructure building. AI is likely to be the next transformative technology wave, and falling prices for GPU compute are actually good for long-term innovation and for startups. If Cahn's prediction comes true, the main harm will fall on investors. Founders and company builders will keep building in AI - and they will be more likely to succeed, because they will benefit from both lower costs and the lessons learned during this period of experimentation.

5. Artificial intelligence will create tremendous economic value: Company builders focused on delivering value to end users will be rewarded handsomely. We are living through what may be a generation-defining technology wave. Companies like Nvidia deserve credit for the role they have played in driving this transformation, and they are likely to play a critical role in the ecosystem for a long time to come.

However, Cahn also reiterated that speculative frenzies are part of technology, and so they are not something to fear. Those who keep a level head now have the chance to build extremely important companies. But we must make sure we do not buy into the delusion that has spread from Silicon Valley to the rest of the country and the world: that we are all going to get rich quickly, because AGI is arriving tomorrow and we all need to stockpile the one valuable resource, which is GPUs.

"In fact, the road ahead will be long. It will have ups and downs. But it is almost certain that it is worth it." David Cahn emphasized.

Potential Challengers

Although this topic has been discussed many times, the conclusion seems settled. As Daniel Newman, CEO of Futurum Group, said, "Currently, there is no archenemy for Nvidia in the world."

The reasons are as follows: Nvidia's graphics processing units (GPUs), originally created in 1999 for ultra-fast 3D graphics in PC video games, later proved to be very well suited to training large generative AI models. The models built by companies such as OpenAI, Google, Meta, Anthropic, and Cohere keep getting larger, which in turn requires ever more AI chips for training. For years, Nvidia's GPUs have been regarded as the most powerful and the most sought after.

These costs are certainly not small: training top-tier generative AI models requires tens of thousands of the highest-end GPUs, each priced at $30,000 to $40,000. For example, Elon Musk recently said that training his company xAI's Grok 3 model into "something special" would require 100,000 top-end Nvidia GPUs, which would bring Nvidia more than $3 billion in chip revenue.
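For a sense of scale, a quick check using only the figures quoted above (GPU count and unit price range) shows how the ">$3 billion" estimate falls out:

```python
# Quick check: 100,000 top-end GPUs at the quoted $30k-$40k unit price.
gpus = 100_000
low, high = 30_000, 40_000

print(f"Implied chip spend: ${gpus * low / 1e9:.0f}B to ${gpus * high / 1e9:.0f}B")
# -> Implied chip spend: $3B to $4B, consistent with "more than $3 billion"
```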

However, Nvidia's success is a product not only of its chips but also of the software that makes those chips easy to use. Nvidia's software ecosystem has become the default choice for a large share of AI developers, who have little incentive to switch. At last week's annual shareholder meeting, Nvidia CEO Jensen Huang called the company's CUDA (Compute Unified Device Architecture) software platform a "virtuous cycle": as the number of users grows, Nvidia can invest more in upgrading the ecosystem, which in turn attracts more users.

In contrast, Nvidia's semiconductor rival AMD controls about 12% of the global GPU market. The company does have competitive GPUs and is improving its software, Newman said. But while it offers an alternative for companies that do not want to be locked into Nvidia, it lacks the established base of developers who already find CUDA easy to use.

In addition, although large cloud service providers such as Amazon's AWS, Microsoft Azure, and Google Cloud all produce their own proprietary chips, they do not intend to replace Nvidia. Instead, they hope to have a variety of AI chips available to optimize their own data center infrastructure, reduce prices, and sell their cloud services to the widest potential customer base.

Jack Gold, an analyst at J. Gold Associates, explained: "Nvidia has an early development momentum, and when you build a rapidly growing market, it is difficult for others to catch up." He said Nvidia has done a good job in creating a unique ecosystem that others do not have.

Matt Bryson, Senior Vice President of Equity Research at Wedbush, added that it would be particularly difficult to replace Nvidia's chips used for training large-scale AI models. He explained that most of the current spending on computing power is directed to this area. "I think this dynamic will not change for some time," he said.

However, a growing number of AI chip startups, including Cerebras, SambaNova, Groq, and most recently Etched and Axelera, see an opportunity to take a share of Nvidia's AI chip business. They focus on meeting the specific needs of AI companies, especially so-called "inference": running data through already-trained AI models so they can produce output (each ChatGPT answer, for example, requires inference).

For example, just last week Etched raised $120 million to develop Sohu, a chip dedicated to running transformer models - the AI model architecture used by OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude. According to the company, the chip will be manufactured by TSMC on its 4nm process, and it has secured high-bandwidth memory and server supply from "top-tier suppliers" whose names it did not disclose. Etched also claims that Sohu is "an order of magnitude faster and cheaper" than Nvidia's upcoming Blackwell GPU, and that an eight-chip Sohu server can process more than 500,000 Llama 70B tokens per second. The company arrived at that comparison by extrapolating from published MLPerf benchmark results for an eight-GPU Nvidia H100 server, which show it processing about 23,000 Llama 70B tokens per second. Etched CEO Uberti said in an interview that one Sohu server would replace 160 H100 GPUs (see the quick arithmetic check below).

Dutch startup Axelera AI, which is developing chips for AI applications, said last week that it had secured $68 million in financing to support its ambitious growth plans. Based in Eindhoven, the company aims to become Europe's answer to Nvidia, offering AI chips it says are 10 times more energy-efficient and 5 times cheaper than competitors'. At the core of Axelera's technology is the Thetis Core chip, which can perform an impressive 260,000 calculations per cycle, compared with 16 or 32 for an ordinary processor. That makes it well suited to the vector-matrix multiplications that dominate neural network workloads. Axelera says its chips deliver high performance and usability at a fraction of the cost of existing solutions, which could make AI accessible to a broader range of applications and users.
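Taking both companies' quoted figures at face value, a quick sanity check shows that Etched's per-server throughput claim and its "160 H100s" comparison are roughly consistent with each other:

```python
# Sanity check of Etched's quoted figures (taken at face value, per the article).
sohu_server_tokens_per_s = 500_000   # claimed, 8-chip Sohu server, Llama 70B
h100_server_tokens_per_s = 23_000    # cited MLPerf figure, 8-GPU H100 server, Llama 70B

speedup = sohu_server_tokens_per_s / h100_server_tokens_per_s
print(f"Claimed per-server speedup: ~{speedup:.1f}x")            # ~21.7x

# One Sohu server replacing 160 H100 GPUs equals 20 eight-GPU servers,
# which lines up with the roughly 20x per-server throughput claim above.
print(f"160 H100s / 8 per server = {160 // 8} servers")
```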

Meanwhile, Groq, which focuses on running models at very high speed, is reportedly raising new funds at a $2.5 billion valuation, while Cerebras is said to have confidentially filed for an initial public offering just a few months after releasing its latest chip, which it claims can train AI models 10 times larger than GPT-4 or Gemini.

All these startups may initially focus on a niche market, such as providing more efficient, faster, or cheaper chips for certain tasks. They may also focus more on specialized chips for specific industries or artificial intelligence devices like personal computers and smartphones. "The best strategy is to carve out a niche market rather than trying to conquer the world, which is what most of them are trying to do," said Jim McGregor, chief analyst at Tirias Research.

Thus, perhaps a more pertinent question is: how much market share can these startups capture alongside cloud providers and semiconductor giants like AMD and Intel? This remains to be seen, especially since the chip market for running AI models or inference is still very new.
