Google's TPU vs NVIDIA's CUDA: is the moat cracking?

Liwa Insights · 4 February 2026 · 9 min read

Here is a puzzle. If a competing chip were meaningfully cheaper to run, you'd expect buyers to switch. Google's latest TPU is reportedly far cheaper per unit of work, and yet NVIDIA still owns roughly four-fifths of the market. Both things are true at once. What's holding the dam?

The challenger is no longer hypothetical

Google's seventh-generation TPU, Ironwood, is generally available and aimed squarely at large-scale inference in what Google calls the "agentic era." It targets a 4× improvement in inference performance and is described as nearly 30× more efficient than the first-generation TPU, while TPUs in general are reported to deliver 2-3× better performance per watt than GPUs (Introl, VentureBeat). And the volume is real: Google plans to make more than 3 million TPUs in 2026 and around 5 million in 2027 (Longbridge).

The economics bite. SemiAnalysis estimates total cost of ownership for an Ironwood-based server is roughly 44% lower than an equivalent NVIDIA GB200 server, and even after Google's margin, external customers such as Anthropic see around a 30% cost reduction (GO Markets, Pillitteri).

~44%: Lower TCO, Ironwood vs GB200 server (est.)

81%: NVIDIA's share of AI data-center chips (IDC)

3M+: TPUs Google plans to build in 2026

So why is NVIDIA still 81%?

IDC puts NVIDIA at roughly 81% of the AI data-center chip market (GO Markets). The reason is a word that's become industry shorthand: the CUDA moat. Fifteen years of libraries, kernels, tooling and muscle memory mean most of the world's AI code assumes NVIDIA underneath. Switching isn't a procurement decision; it's a re-engineering project. As the framing goes, NVIDIA locks you in with CUDA; Google opens a side door with TPU plus Gemini (Longbridge).

A moat made of software is strongest when compute is cheap and switching is optional. But when compute becomes your largest cost line, a 30-44% saving starts paying for the migration. At what price of GPU-hours does loyalty become irrational?
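One rough way to frame that question is a payback calculation: how many months of savings it takes to cover the one-off cost of porting off CUDA. The sketch below is illustrative only. The function name `payback_months`, the $1M monthly compute spend, and the $3M migration cost are hypothetical inputs chosen for the arithmetic; only the 30% and 44% saving figures come from the estimates cited above.

```python
def payback_months(monthly_compute_spend: float,
                   saving_fraction: float,
                   migration_cost: float) -> float:
    """Months until cumulative savings cover a one-off migration cost.

    All inputs are assumptions supplied by the caller; this ignores
    discounting, ramp-up time, and any performance differences.
    """
    if monthly_compute_spend <= 0:
        raise ValueError("monthly_compute_spend must be positive")
    if not 0 < saving_fraction < 1:
        raise ValueError("saving_fraction must be between 0 and 1")
    monthly_saving = monthly_compute_spend * saving_fraction
    return migration_cost / monthly_saving


# Hypothetical scenario: $1M/month on compute, $3M to port and revalidate.
for saving in (0.30, 0.44):
    months = payback_months(1_000_000, saving, 3_000_000)
    print(f"{saving:.0%} saving -> payback in about {months:.1f} months")
```

Under these made-up inputs the migration pays for itself in well under a year at either saving level. A smaller compute bill or a costlier port stretches the payback proportionally, which is the sense in which the moat is strongest when compute is cheap.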

The answer most people miss

Notice what the entire TPU-vs-CUDA debate is about: chips and software. Notice what it never argues about: the building. Whether the winning accelerator is a Rubin GPU, an Ironwood TPU, or an AMD Instinct part, every one of them is a dense, power-hungry, liquid-cooled device that has to live somewhere with cheap electricity and serious cooling.

The chip wars are a coin flip you don't have to call. The infrastructure underneath is the same bet no matter who wins. In a gold rush, you can argue about which prospector is fastest, or you can sell everyone the same picks and shovels.

Where this meets Liwa

Liwa is deliberately silicon-agnostic. Bring NVIDIA, bring AMD Instinct, bring a TPU-class accelerator: our racks are rated to 150 kW and liquid-cooled, with power at $0.10/kWh. We don't ask you to bet on a winner in the chip war. We give you the one thing all the contenders need: a place to run flat-out, cheaply. You keep optionality; we keep the lights on.
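To put those two numbers together, here is a back-of-the-envelope sketch of the annual energy bill for a single rack. It assumes the rack draws its full 150 kW rating around the clock and that $0.10/kWh is the all-in price; cooling overhead, demand charges, and downtime are not modelled. The `utilization` parameter and the function name are hypothetical conveniences, not part of any Liwa pricing model.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost(rack_kw: float,
                       price_per_kwh: float,
                       utilization: float = 1.0) -> float:
    """Yearly electricity cost for one rack at a given average utilization."""
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be between 0 and 1")
    energy_kwh = rack_kw * utilization * HOURS_PER_YEAR
    return energy_kwh * price_per_kwh


# 150 kW rack at $0.10/kWh, running flat-out all year (an assumption).
cost = annual_energy_cost(150, 0.10)
print(f"${cost:,.0f} per rack per year")
```

At full load this works out to roughly $131,400 per rack per year (150 kW × 8,760 h × $0.10/kWh), and the figure is the same whichever vendor's silicon sits in the rack.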

Questions we're sitting with

If your inference bill is your biggest cost, what's stopping a 30%+ saving from funding a move off CUDA: habit, or genuine lock-in?
Should a builder commit a facility to one vendor's roadmap, or design for whatever silicon wins the decade?
When the chip is a coin flip, is "cheap power + dense cooling" the only durable bet on the board?

Hedging the chip war? Bet on the infrastructure.

Vendor-agnostic, liquid-cooled, 150 kW racks at $0.10/kWh, for NVIDIA, AMD, or whatever ships next.


Sources

TCO and efficiency figures are third-party estimates (including SemiAnalysis, as cited above) as of 2026 and depend heavily on workload.