AI's Insatiable Appetite for Compute
Nvidia's H200 GPU, the workhorse of modern AI data centers, is in critically short supply. New orders placed today face at least a six-month wait for delivery, up from three months at the end of 2025 and roughly two weeks when the chip launched in mid-2024. The shortage reflects explosive growth in AI compute demand, which is outstripping even Nvidia's expanded manufacturing capacity.
What Is Driving the Shortage
Multiple factors are converging to create the supply crunch:
- Training demand: OpenAI, Google, Anthropic, Meta, and xAI are all training next-generation models that require tens of thousands of H200 chips each.
- Inference scaling: As AI applications reach hundreds of millions of users, the compute needed for inference (running trained models to serve user requests) has grown faster than anticipated; see the back-of-envelope sketch after this list.
- Enterprise adoption: Companies across every industry are building AI infrastructure, competing for the same limited chip supply as the hyperscalers.
- Sovereign AI: Governments worldwide are building national AI compute infrastructure, adding another layer of demand.
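The inference point is easy to underestimate, so a back-of-envelope calculation helps. The Python sketch below is illustrative only: every figure in it (user count, request rate, per-GPU throughput, utilization) is an assumption chosen for round numbers, not a reported statistic. What it shows is the shape of the problem: serving demand scales linearly with users, and a single popular application can tie up tens of thousands of GPUs before a single training run is counted.

```python
# Back-of-envelope estimate of inference GPU demand.
# Every figure below is an illustrative assumption, not a reported number.

daily_active_users = 200_000_000     # assumed user base for one popular app
requests_per_user_per_day = 10       # assumed usage rate
tokens_per_request = 1_000           # assumed average generated length
tokens_per_gpu_per_sec = 2_500       # assumed H200 serving throughput
utilization = 0.5                    # assumed average fleet utilization

tokens_per_day = daily_active_users * requests_per_user_per_day * tokens_per_request
tokens_per_sec = tokens_per_day / 86_400          # seconds per day

gpus_needed = tokens_per_sec / (tokens_per_gpu_per_sec * utilization)

print(f"Sustained load: {tokens_per_sec:,.0f} tokens/sec")
print(f"GPUs required for serving alone: {gpus_needed:,.0f}")
```

Under these assumptions, one application generates a sustained load of roughly 23 million tokens per second and needs about 18,500 H200s running around the clock, and doubling the user base doubles the fleet.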
Impact on the Industry
The shortage is having ripple effects across the technology sector. Cloud providers are limiting the availability of H200-based instances, with some providers implementing waitlists for new customers. Startups that need GPU capacity to train models are finding it increasingly difficult to secure compute, creating what some founders describe as a two-tier market where well-funded companies with existing Nvidia relationships get priority.
"The GPU shortage is becoming the defining bottleneck of the AI era. Companies with compute have a massive competitive advantage over those without it," said Dylan Patel, chief analyst at SemiAnalysis.
Secondary-market prices for H200 chips have risen to 40% above list price. Some brokers are offering H200 systems at significant premiums with shorter delivery times, creating a gray market that Nvidia has tried to discourage but cannot fully prevent.
Nvidia's Response
Nvidia is working to increase production capacity but faces its own supply chain constraints. The H200 is manufactured by TSMC on its advanced 4nm process, and TSMC's capacity allocation is committed months in advance. Nvidia CEO Jensen Huang has said the company is working with TSMC to expand dedicated capacity and is also accelerating the timeline for the next-generation Blackwell B200, which offers 2.5x the per-chip performance of the H200 and could relieve some demand pressure by reducing the number of chips each workload requires.
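To make that arithmetic concrete, here is a minimal sketch of how a per-chip speedup translates into unit demand. The 2.5x figure is the one cited above; the cluster size and demand-growth factor are illustrative assumptions. It also shows why the relief is only partial: if total compute demand grows faster than per-chip performance, the number of chips needed still rises.

```python
# Per-chip speedups cut unit demand only while workloads stay fixed.
# The 2.5x speedup is the figure cited in the article; the other
# numbers are illustrative assumptions.

speedup = 2.5           # B200 per-chip performance relative to H200
h200_cluster = 10_000   # assumed chips needed for a given workload today
demand_growth = 3.0     # assumed growth in total compute demand

# Same workload on the faster chip: 10,000 / 2.5 = 4,000 chips.
b200_cluster = h200_cluster / speedup
print(f"Same workload on B200: {b200_cluster:,.0f} chips")

# But if demand triples while per-chip performance rises only 2.5x,
# unit demand still grows: 10,000 * 3.0 / 2.5 = 12,000 chips.
chips_after_growth = h200_cluster * demand_growth / speedup
print(f"Chips needed after {demand_growth:.0f}x demand growth: {chips_after_growth:,.0f}")
```

In other words, faster chips buy headroom, but the shortage only eases if manufacturing capacity and per-chip gains together outpace demand.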
Alternative Approaches
The shortage is accelerating interest in alternative AI chip architectures. AMD's MI350 chip has seen a significant uptick in adoption, though it still lacks the software ecosystem maturity of Nvidia's CUDA platform. Custom chips from Google (TPU v6), Amazon (Trainium2), and Microsoft (Maia) are being deployed within those companies' own clouds, reducing their dependence on Nvidia for some workloads.
Several startups, including Cerebras, Groq, and SambaNova, are also seeing increased interest from customers frustrated by Nvidia availability. While these alternatives are not drop-in replacements, they offer competitive performance for specific workloads, particularly inference.
Outlook
Analysts expect the shortage to persist through at least the end of 2026. The fundamental dynamic of AI compute demand growing faster than semiconductor manufacturing capacity is unlikely to change in the near term. For companies planning AI infrastructure investments, the message is clear: order early and plan for long lead times.
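For planning purposes, the lead-time math is simple but worth stating, as in the sketch below. The six-month figure is the one quoted above; the deployment target is an illustrative assumption.

```python
# Working backward from a deployment target under current lead times.
# The six-month lead time is the figure cited above; the target date
# is an illustrative assumption.
from datetime import date

def latest_order_date(target: date, lead_time_months: int) -> date:
    """Latest order date that still meets the deployment target."""
    months = target.year * 12 + (target.month - 1) - lead_time_months
    return target.replace(year=months // 12, month=months % 12 + 1)

# Assumed target: capacity online by July 1, 2027, with a 6-month lead time.
print(latest_order_date(date(2027, 7, 1), 6))   # 2027-01-01
```

And since lead times have been stretching rather than shrinking, the quoted figure is better treated as a floor than an estimate.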
