Google's TPU vs NVIDIA's CUDA: is the moat cracking?
Here is a puzzle. If a competing chip were meaningfully cheaper to run, you'd expect buyers to switch. Google's latest TPU is reportedly far cheaper per unit of work, and yet NVIDIA still owns roughly four-fifths of the market. Both things are true at once. What's holding the dam?
The challenger is no longer hypothetical
Google's seventh-generation TPU, Ironwood, is generally available and aimed squarely at large-scale inference for what Google calls the "agentic era." It targets a 4× inference improvement and is described as nearly 30× more efficient than the first-generation TPU, with TPUs generally delivering 2-3× better performance per watt than GPUs (Introl, VentureBeat). And the volume is real: Google reportedly plans to make more than 3 million TPUs in 2026 and around 5 million in 2027 (Longbridge).
The economics bite. SemiAnalysis estimates total cost of ownership for an Ironwood-based server is roughly 44% lower than an equivalent NVIDIA GB200 server, and even after Google's margin, external customers such as Anthropic see around a 30% cost reduction (GO Markets, Pillitteri).
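To make those percentages concrete, here is a minimal sketch of the arithmetic. The $1,000,000 baseline is a hypothetical round number chosen for illustration, not a quoted GB200 price, and the function name is ours; only the 44% and 30% reductions come from the estimates cited above.

```python
def tco_after_reduction(baseline_cost: float, reduction: float) -> float:
    """Return the cost left after a fractional reduction (0.44 means 44% lower)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    return baseline_cost * (1.0 - reduction)


# Hypothetical baseline spend (an assumption for illustration only).
BASELINE_USD = 1_000_000.0

# Reductions taken from the third-party estimates cited in the text.
SCENARIOS = [
    ("Ironwood server TCO estimate (44% lower)", 0.44),
    ("External customer after Google's margin (about 30% lower)", 0.30),
]

for label, reduction in SCENARIOS:
    remaining = tco_after_reduction(BASELINE_USD, reduction)
    saved = BASELINE_USD - remaining
    print(f"{label}: pay ${remaining:,.0f}, save ${saved:,.0f}")
```

On that hypothetical baseline, the gap between the two scenarios is the share of the saving that stays with Google rather than reaching the customer.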
So why is NVIDIA still at 81%?
IDC puts NVIDIA at roughly 81% of the AI data-center chip market (GO Markets). The reason is a word that's become industry shorthand: the CUDA moat. Fifteen years of libraries, kernels, tooling and muscle memory mean most of the world's AI code assumes NVIDIA underneath. Switching isn't a procurement decision; it's a re-engineering project. As the framing goes, NVIDIA locks you in with CUDA; Google opens a side door with TPU plus Gemini (Longbridge).
The answer most people miss
Notice what the entire TPU-vs-CUDA debate is about: chips and software. Notice what it never argues about: the building. Whether the winning accelerator is a Rubin GPU, an Ironwood TPU, or an AMD Instinct part, every one of them is a dense, power-hungry, liquid-cooled device that has to live somewhere with cheap electricity and serious cooling.
The chip wars are a coin flip you don't have to call. The infrastructure underneath is the same bet no matter who wins. In a gold rush, you can argue about which prospector is fastest, or you can sell everyone the same picks and shovels.
Liwa is deliberately silicon-agnostic. Bring NVIDIA, bring AMD Instinct, bring a TPU-class accelerator: our racks are rated to 150 kW, liquid-cooled, with power at $0.10/kWh. We don't ask you to bet on a winner in the chip war. We give you the one thing all the contenders need: a place to run flat-out, cheaply. You keep optionality; we keep the lights on.
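For a sense of scale, here is a rough sketch of what those two numbers imply for a single rack's electricity bill. It assumes the rack draws its full rated 150 kW around the clock for a non-leap year and ignores cooling overhead, demand charges, and anything else beyond the per-kWh price; the function and its utilization parameter are illustrative, not part of any published rate card.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year


def annual_energy_cost(rack_kw: float, price_per_kwh: float,
                       utilization: float = 1.0) -> float:
    """Estimate yearly energy cost in USD for one rack.

    utilization is the assumed average fraction of rated power actually drawn.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    energy_kwh = rack_kw * utilization * HOURS_PER_YEAR
    return energy_kwh * price_per_kwh


# Figures from the text: 150 kW racks, $0.10/kWh.
full_load = annual_energy_cost(150.0, 0.10)
# Hypothetical 70% average draw, purely as a comparison point.
partial_load = annual_energy_cost(150.0, 0.10, utilization=0.7)

print(f"Full load:  ${full_load:,.0f} per rack per year")
print(f"70% load:   ${partial_load:,.0f} per rack per year")
```

At full load that works out to about $131,400 per rack per year, whichever vendor's silicon is in the rack.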
Questions we're sitting with
- If your inference bill is your biggest cost, what's stopping a 30%+ saving from funding a move off CUDA: habit, or genuine lock-in?
- Should a builder commit a facility to one vendor's roadmap, or design for whatever silicon wins the decade?
- When the chip is a coin flip, is "cheap power + dense cooling" the only durable bet on the board?
Hedging the chip war? Bet on the infrastructure.
Vendor-agnostic, liquid-cooled, 150 kW racks at $0.10/kWh for NVIDIA, AMD, or whatever ships next.
Sources
- Longbridge, TPU vs GPU: can NVIDIA's moat hold?
- GO Markets, NVIDIA vs Google TPU (market share)
- VentureBeat, How TPUs reshape large-scale AI economics
- Introl, TPU vs GPU decision framework
- Pillitteri, NVIDIA vs Google TPU / Anthropic (2026)
TCO and efficiency figures are third-party estimates (including SemiAnalysis, as cited in the sources above) as of 2026 and depend heavily on workload.