Google Switches Gemini to Compute-Based Usage Limits: What It Means for Your Plan

Priya Deshmukh, PhD

18 May 2026



If you've been bumping up against your Gemini limits lately, here's something you'll want to know. Google is changing how it counts your usage, and it's another sign that those tidy flat-rate AI plans we all got used to just can't keep up anymore. The new approach moves away from a simple daily request count and toward something Google calls "compute-based" usage. It's a meaningful shift, and depending on how you use Gemini, it could change your day-to-day experience quite a bit.

What "Compute-Based" Usage Actually Means

Here's the heart of it. Instead of counting how many times you hit send, Gemini now looks at how much work each request actually demands. According to a Google support document, the new compute-based limits factor in the complexity of your prompt, the features you lean on, and the length of your chat.

That last part matters more than it might sound. A quick one-line question and a sprawling, multi-turn conversation aren't treated the same anymore. Nor is a basic text prompt treated the same as one that fires up image and video generation, deep research, or the heavier Pro and extended-thinking models like Deep Think. The more horsepower you ask for, the more it counts against your allowance.
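To make the idea concrete, here is a minimal sketch of how a compute-weighted meter could differ from a flat request count. Google has not published its formula, so everything here is an assumption for illustration: the `FEATURE_WEIGHTS` values, the `Request` fields, and the `estimate_cost` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights in arbitrary units. Google has not published real values.
FEATURE_WEIGHTS = {
    "text": 1.0,
    "image_generation": 5.0,
    "deep_think": 10.0,
    "deep_research": 15.0,
    "video_generation": 20.0,
}


@dataclass
class Request:
    prompt_tokens: int
    chat_history_tokens: int  # a long chat drags its earlier turns along
    feature: str = "text"


def estimate_cost(req: Request) -> float:
    """Return an illustrative compute cost: feature weight times context size."""
    context = req.prompt_tokens + req.chat_history_tokens
    return FEATURE_WEIGHTS[req.feature] * context / 1000


quick = Request(prompt_tokens=20, chat_history_tokens=0)
research = Request(prompt_tokens=200, chat_history_tokens=0, feature="deep_research")
long_chat = Request(prompt_tokens=20, chat_history_tokens=30_000)

for name, req in [("quick", quick), ("research", research), ("long_chat", long_chat)]:
    print(f"{name}: {estimate_cost(req):.2f} units")
```

Under a flat count, all three requests would cost one prompt each. Under a weighted meter like this one, the same short question becomes far more expensive when it arrives at the end of a very long chat.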

Google has been pretty vague about the exact numbers, which is frustrating but not surprising. The one clear signal: paid users get higher limits than free users. Beyond that, you're mostly working in the dark on specifics.

How the Limits Refresh

There's a rhythm to this new system worth understanding. The compute-based limits for Gemini refresh every five hours, and that cycle continues until you hit a weekly cap. So it's not a single daily reset you can plan your whole day around. Instead, you've got rolling windows that top up through the week, with a ceiling that holds everything in check.
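The refresh rhythm described above can be sketched as two nested budgets: a short window that resets every five hours and a weekly ceiling on top of it. This is only an illustration. The `UsageTracker` class, its field names, and the allowance numbers are hypothetical, and it assumes fixed windows that restart on the first request after they expire, which may not match how Google actually schedules refreshes.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 5 * 60 * 60       # five-hour refresh, per the article
WEEK_SECONDS = 7 * 24 * 60 * 60    # weekly cap period


@dataclass
class UsageTracker:
    window_allowance: float  # hypothetical units available per five-hour window
    weekly_cap: float        # hypothetical units available per week
    window_start: float = 0.0
    week_start: float = 0.0
    window_used: float = 0.0
    week_used: float = 0.0

    def try_spend(self, cost: float, now: float) -> bool:
        """Record the cost and return True if both budgets allow it."""
        # Start a fresh week once seven days have elapsed.
        if now - self.week_start >= WEEK_SECONDS:
            self.week_start, self.week_used = now, 0.0
        # Top up the short window once five hours have elapsed.
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start, self.window_used = now, 0.0
        if self.window_used + cost > self.window_allowance:
            return False  # wait for the next five-hour refresh
        if self.week_used + cost > self.weekly_cap:
            return False  # weekly ceiling reached; refreshes no longer help
        self.window_used += cost
        self.week_used += cost
        return True


tracker = UsageTracker(window_allowance=10, weekly_cap=25)
print(tracker.try_spend(10, now=0))                   # fits in the first window
print(tracker.try_spend(1, now=60))                   # window exhausted
print(tracker.try_spend(10, now=WINDOW_SECONDS))      # window refreshed
print(tracker.try_spend(10, now=2 * WINDOW_SECONDS))  # blocked by the weekly cap
```

The last call is the case worth noticing: the five-hour window has refreshed, but the weekly cap still says no.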

How the Paid Plans Stack Up

This is probably what you actually came here to find out. Google has tied your usage ceiling directly to which plan you're on, and each plan's multiplier is measured against a "standard" baseline that applies to people without a paid plan.

Plan                 Monthly Price   Usage Limit
No plan (standard)   Free            Standard baseline
Google AI Plus       $8              2x standard
Google AI Pro        $20             4x standard
Google AI Ultra      $250            20x standard

So if you're on the $8-a-month AI Plus plan, you get twice the standard allowance. Step up to the $20 AI Pro plan and that jumps to four times. And the $250-a-month AI Ultra plan? That's a hefty 20 times the standard limits. Whether that math works out for you depends entirely on how heavily you actually use the more demanding features.
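One quick way to compare the tiers is to divide each price by its multiplier, which gives the cost of one "standard" allowance on that plan. The prices and multipliers below come from the table above. The `PLANS` dictionary and the `price_per_baseline` helper are just illustrative names, and the comparison ignores anything else the plans bundle.

```python
# (monthly price in USD, multiple of the standard allowance), from the table above
PLANS = {
    "Google AI Plus": (8, 2),
    "Google AI Pro": (20, 4),
    "Google AI Ultra": (250, 20),
}


def price_per_baseline(price: float, multiplier: float) -> float:
    """Monthly dollars paid per 1x of the standard allowance."""
    return price / multiplier


for name, (price, multiplier) in PLANS.items():
    unit = price_per_baseline(price, multiplier)
    print(f"{name}: {multiplier}x standard, ${unit:.2f} per 1x of allowance")
```

By this measure Plus works out to $4.00 per baseline unit, Pro to $5.00, and Ultra to $12.50, so the higher tiers buy more total headroom but at a higher price per unit of allowance.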

Why Google Made This Change

This isn't really about Google being stingy. It's about the underlying economics of modern AI breaking the old model. Powerful agentic features have a way of spawning sub-agents that can chew through tens of thousands of tokens across multiple turns from a single request. A flat daily count of "100 prompts" just doesn't capture that reality anymore, because one heavy prompt can cost dramatically more than another.

Think about how the old system worked. Google AI Pro users used to get up to 100 Gemini 3.1 Pro prompts per day, and it didn't matter how complicated those prompts were. A trivial question and a research-heavy, multi-step task counted exactly the same. That made sense when AI requests were roughly comparable in cost. It stopped making sense once agentic features arrived and blew the variance wide open.

The Industry Is Moving the Same Direction

Google isn't acting alone here, and that's the bigger story. This move comes less than a month after GitHub overhauled its Copilot plans, ditching its old "premium request units" model in favor of "AI Credits" based on the actual tokens used during AI exchanges. Same logic, different company: charge for what the work actually costs, not for a flat count that ignores complexity.

The big AI providers are all wrestling with the same problem. Ever more powerful agentic features keep raising the compute bill, and the old flat-rate plans simply weren't designed for this kind of load.

Not everyone is tightening the screws, though. Anthropic recently went the other way and doubled the Claude Code limits for its Claude Pro and Max plans. But there's a catch worth noting: that only happened after Anthropic inked a deal with SpaceX to boost its compute capacity. And just last month, an Anthropic exec admitted that the current Claude Pro and Max plans "weren't built" for features like Claude Code and Cowork, the Claude desktop feature that turns AI agents loose on your PC. So even the company bucking the trend basically acknowledged the same underlying strain.

What This Means for Everyday Gemini Users

If you're a light user firing off occasional questions, you may not notice much. The pressure of this change lands hardest on people leaning into the heavy stuff: deep research, image and video generation, long winding conversations, and the extended-thinking models. Those are exactly the features compute-based limits are designed to meter more carefully.

The practical takeaway is to be a little more deliberate. Long chats and feature-heavy prompts now carry more weight against your allowance than a simple back-and-forth. Knowing that the system refreshes every five hours under a weekly cap can also help you pace yourself rather than getting blindsided when you hit a wall mid-project.