The Eighth Generation TPU Splits Into Two Distinct Chips
Google Cloud's eighth generation of custom AI chips — called tensor processing units, or TPUs — takes a different approach than any previous version. Instead of a single chip trying to do everything, the new generation is split into two purpose-built designs: the TPU 8t, built specifically for training AI models, and the TPU 8i, engineered for inference.
That distinction matters more than it might seem at first glance. Training is the heavy, upfront work of building a model — computationally brutal, power-hungry, and time-consuming. Inference is everything that happens after that: the moment a user sends a prompt and the model actually responds. These two tasks have very different demands, and Google is betting that specialized chips serve each one better than a general-purpose design ever could.
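The difference in demands is easiest to see in code. Below is a minimal sketch in JAX, the framework most TPU workloads are written in, contrasting the two phases. The tiny linear model, data, and learning rate are invented purely for illustration; they are not anything Google has published about these chips.

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # Inference: a single forward pass through a tiny linear model.
    w, b = params
    return x @ w + b

def loss(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=0.1):
    # Training: a forward pass PLUS a gradient computation PLUS a
    # parameter update, repeated over many passes through the data.
    # This is why training is the heavy, upfront, power-hungry phase.
    grads = jax.grad(loss)(params, x, y)
    w, b = params
    gw, gb = grads
    return (w - lr * gw, b - lr * gb)

# Toy data: 32 samples, 4 features, a known linear target.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4))
y = x @ jnp.ones((4, 1))

params = (jnp.zeros((4, 1)), jnp.zeros((1,)))
for _ in range(200):                 # the expensive loop
    params = train_step(params, x, y)

# Once trained, serving a request is just the cheap forward pass.
preds = predict(params, x)
```

The asymmetry is the whole point of the split: a training chip is optimized for the repeated gradient-and-update loop, while an inference chip only ever needs the forward pass, at low latency, millions of times over.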
Performance Numbers That Are Hard to Ignore
The specs Google is claiming for the new TPUs are genuinely impressive. Compared to previous generations, the new chips deliver up to 3x faster AI model training, with 80% better performance per dollar. And in what might be the most striking figure of all, up to one million TPUs can now work together inside a single cluster.
That's not just a raw performance upgrade — it's a fundamental shift in the economics of running AI at scale. More compute, less energy, lower costs for customers. The whole premise is that you get dramatically more horsepower without the bill ballooning to match. For enterprises trying to figure out how to run serious AI workloads without hemorrhaging money on cloud costs, that's a real conversation starter.
Google calls these chips TPUs rather than GPUs for a historical reason: the original custom low-power chip was the Tensor Processing Unit, named for tensors, the mathematical structures at the heart of machine learning. The name stuck through eight generations.
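For context, a tensor is nothing exotic: it is simply an n-dimensional array, the basic unit of data these chips move and multiply. A quick illustration (the shapes here are arbitrary, chosen only to show the idea):

```python
import jax.numpy as jnp

scalar = jnp.array(3.0)              # rank-0 tensor: a single number
vector = jnp.array([1.0, 2.0, 3.0])  # rank-1 tensor
matrix = jnp.eye(3)                  # rank-2 tensor
# A batch of 32 RGB images is a rank-4 tensor: (batch, height, width, channels)
images = jnp.zeros((32, 224, 224, 3))

print(scalar.ndim, vector.ndim, matrix.ndim, images.ndim)  # 0 1 2 4
```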
Google Isn't Declaring War on Nvidia — Not Yet, Anyway
Here's where it gets interesting. You might read "Google announces new AI chips" and assume this is a direct offensive against Nvidia. It's not — at least not right now.
Like Amazon and Microsoft, Google is using its custom chips to supplement Nvidia's hardware in its cloud infrastructure, not replace it. In fact, Google has made a point of promising that Nvidia's latest chip, Vera Rubin, will be available on its cloud later this year. That's not the move of a company trying to shut Nvidia out.
The longer game is something else. The theory goes that as more enterprises shift their AI needs into the major clouds and start porting applications to chips like Google's TPUs, the hyperscalers — Google, Amazon, Microsoft — may gradually need Nvidia less. But that's a slow-moving shift, not a sudden one, and nothing about the current moment suggests Nvidia is in trouble.
A Decade of Predictions That Didn't Quite Pan Out
Chip market analyst Patrick Moorhead made that point in a wry way, noting on X that he had predicted back in 2016 — when Google launched its very first TPU — that the chip could spell bad news for Nvidia. Since then, Nvidia has grown into a company with a market cap approaching $5 trillion. The prediction didn't exactly age well.
The practical reality is that if Google keeps growing as an AI cloud provider, that growth likely means more business for Nvidia, not less, even if a significant share of workloads run on Google's own silicon. It's not zero-sum — at least not yet.
Google and Nvidia Are Actually Collaborating
This is the part that tends to get overlooked. Far from a cold war, Google and Nvidia are actively working together to make Nvidia-based systems run more efficiently inside Google's cloud infrastructure.
Specifically, the two companies are collaborating to enhance a software-based networking technology called Falcon. Google originally created Falcon and open-sourced it in 2023 through the Open Compute Project, the open-source data center hardware organization whose designs underpin much of modern cloud infrastructure. The goal of the joint effort is to allow Nvidia-based systems to operate with even better performance inside Google's network. It's a partnership that makes sense for both sides: Google gets a stronger cloud offering, and Nvidia gets a more capable home for its chips.