A year ago the message from the top was simple: use the best models, move fast, never mind the cost. In 2026 the invoices arrived, and the mood changed in a hurry.

Uber burned through its entire 2026 AI coding budget by April. Microsoft pulled most of its internal AI coding licenses months after rolling them out. One company reportedly ran up a $500 million bill in a single month because nobody set a usage limit. As J.R. Storment, who runs the FinOps Foundation, described it, the conversation flipped from tokenmaxxing and going fast to something far more sober.

"We need guardrails. How do we control this?"
— J.R. Storment, Executive Director, FinOps Foundation

If that sounds familiar, you are not behind and you are not alone. But the reflex it triggers — slam the brakes and cut the AI budget — is the wrong one. The problem was never spending on AI. It was spending without intention.

How a cheaper technology produced bigger bills

Here is the part that catches people off guard: AI got cheaper, and the bills went up.

The price of a token has fallen by roughly 98% since late 2022. Over the same stretch, enterprise AI bills rose by an estimated 320%. The average business now spends about 13 times more on AI tokens than it did in January 2025.

98% drop in the price of a token since late 2022 (Epoch AI)

320% rise in enterprise AI bills over the same period

13× more spent on AI tokens than in January 2025

The cause is agents. A chatbot answers one question. An agent loops through dozens of steps to finish a task, and every step burns tokens. Per-developer consumption at large companies rose about 18.6× in nine months, and Goldman Sachs expects token use to multiply 24× by 2030. Cheaper per unit, far more units — and the companies selling the tokens are, understandably, in no hurry to slow that down.

18.6× jump in per-developer token consumption in nine months (FinOps Foundation, via TechCrunch)

24× projected growth in token use by 2030 (Goldman Sachs)
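
The arithmetic behind "cheaper per unit, far more units" can be checked in a few lines. This is a rough sketch, not a claim from the sources. It assumes the whole bill is token spend charged at the average token price, and it treats the 98% price drop and the 320% bill rise as covering the same period. The function name implied_usage_multiple is illustrative.

```python
def implied_usage_multiple(price_drop_pct: float, bill_rise_pct: float) -> float:
    """Return how many times usage must grow for the bill to rise by
    bill_rise_pct while the unit price falls by price_drop_pct.

    Assumes bill = unit price x usage, a simplification for illustration.
    """
    price_multiple = 1 - price_drop_pct / 100  # 98% drop -> 0.02x the old price
    bill_multiple = 1 + bill_rise_pct / 100    # 320% rise -> 4.2x the old bill
    return bill_multiple / price_multiple


usage = implied_usage_multiple(price_drop_pct=98, bill_rise_pct=320)
print(f"Implied usage growth: about {usage:.0f}x")
```

Under those assumptions, usage would have had to grow roughly 210-fold for the bill to rise that much at the lower price. The exact figure matters less than the shape: a small change in unit price is easily swamped by a large change in volume.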

The pricing model is shifting underfoot, too. Through 2025, most enterprises bought flat per-seat plans and the vendor absorbed heavy usage. That math collapsed once agents multiplied consumption tenfold, so the labs moved to usage-based billing and handed the cost risk back to the buyer. The trouble is that token costs are variable and hard to forecast.

~40% average budget overshoot on consumption-based AI contracts (Zylo 2026 SaaS Management Index)

~5% overshoot on the old seat-based licenses they replaced

You cannot plan around a number that swings that far.

The ROI gap is real, and so is the fear

The discomfort underneath all of this is justified. Bain found that nearly 40% of companies that measured their AI cost savings came in under 10% — well short of the 11 to 20% they were targeting. Forrester reports that only 15% of AI decision-makers saw AI-related earnings increases in the past year, and fewer than a third can link AI to income at all.

40% of companies that measured AI savings came in under 10%, short of the 11–20% they targeted (Bain & Company)

15% of AI decision-makers saw AI-related earnings increases in the past year (Forrester, via CIO)

44% plan to fund the next AI wave from savings the last one never delivered (Bain & Company)

Yet 90% of companies are raising their AI budgets again, and nearly half plan to fund the next wave out of savings the last wave never actually delivered. That is the trap in one sentence: the ROI does not add up yet, but the fear of falling behind is strong enough to keep the money moving. So the spending continues on momentum rather than evidence.

Intention is the way out of that loop. Not less ambition — more deliberateness about what you are buying, and why.

Don't overcorrect by slowing down

It is worth saying plainly, because the pendulum is swinging hard right now: most companies are still early. Only about a quarter have moved past pilots into scaled use, and nearly half are still running a handful of narrow projects. If you pull back now, you pay for all the experimentation and capture none of the payoff.

The teams that win the next year will not be the ones that spent the most or the least. They will be the ones that knew what they were running, why, and what it was worth. Here is what that discipline looks like in practice.

Five habits of spending with intention

Spending with intention is not a budget cut. It is a set of habits. Five of them matter most.

Habit 01

Go deeper, not just wider

The early wins were task-level: a faster email, a quicker summary. Those gains are real, but almost impossible to tie to the bottom line. The money shows up when you automate a whole workflow end to end. One caution: do not automate a broken process. AI does not fix a messy workflow; it locks it in and speeds it up. Redesign the process first, then point AI at it.

Habit 02

Measure impact, not activity

Tokens used, prompts sent, and seats filled tell you nothing about value. One study found the heaviest token users were about twice as productive — but spent ten times the tokens to get there. Decide upfront what a use case is supposed to change — a cost, a cycle time, an error rate — and measure that. If it is not on the leadership dashboard, teams optimize for the wrong thing.
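
The study figure above is easier to judge as a ratio. Here is a minimal sketch of that calculation, assuming "twice as productive" and "ten times the tokens" are both measured against the same typical user. The function name is illustrative and not taken from any cited source.

```python
def tokens_per_unit_of_output(token_multiple: float, output_multiple: float) -> float:
    """Return how many times more tokens each unit of output costs.

    token_multiple: tokens used relative to a typical user (10 = ten times).
    output_multiple: output produced relative to a typical user (2 = twice).
    """
    if output_multiple <= 0:
        raise ValueError("output_multiple must be positive")
    return token_multiple / output_multiple


# Figures from the study mentioned above: ten times the tokens, twice the output.
ratio = tokens_per_unit_of_output(token_multiple=10, output_multiple=2)
print(f"Each unit of output cost about {ratio:.0f}x the tokens")
```

On those figures, each unit of output cost about five times the tokens. Whether that is a good trade depends on what the output is worth, which is exactly why the measure has to be a business outcome rather than activity.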

Habit 03

Put guardrails on before you scale

The half-billion-dollar bill happened because no one set a limit. Usage caps per team, an approval step before an autonomous agent goes wide, and a clear view of who is running what are not bureaucracy. They are the difference between a budget you can forecast and one that surprises you in April.
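
A per-team usage cap can be very small in code. The sketch below is one possible shape, not a reference to any vendor's billing API. The class names, the team name, and the dollar amounts are all hypothetical, and a real system would persist spend and reset it each billing period.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a charge would push a team past its cap."""


@dataclass
class TeamBudget:
    cap_usd: float
    spent_usd: float = 0.0


@dataclass
class UsageGuard:
    budgets: dict = field(default_factory=dict)  # team name -> TeamBudget

    def set_cap(self, team: str, cap_usd: float) -> None:
        self.budgets[team] = TeamBudget(cap_usd=cap_usd)

    def charge(self, team: str, cost_usd: float) -> float:
        """Record spend and return the remaining budget.

        Refuses the charge if the team has no cap or would exceed it.
        """
        budget = self.budgets.get(team)
        if budget is None:
            raise BudgetExceeded(f"{team} has no usage cap set")
        if budget.spent_usd + cost_usd > budget.cap_usd:
            raise BudgetExceeded(f"{team} would exceed its ${budget.cap_usd:.2f} cap")
        budget.spent_usd += cost_usd
        return budget.cap_usd - budget.spent_usd


# Hypothetical usage: the second charge is blocked instead of billed.
guard = UsageGuard()
guard.set_cap("platform", cap_usd=100.0)
remaining = guard.charge("platform", 60.0)
try:
    guard.charge("platform", 50.0)
    blocked = False
except BudgetExceeded:
    blocked = True
```

The design choice worth noting is that a team with no cap is refused rather than allowed through. Failing closed is what prevents the "nobody set a limit" case.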

Habit 04

Keep your options open on models

You do not need the most expensive model for most work. The price of a given level of capability has been falling by a factor of 5 to 10 a year, and today's mid-tier models often approach last year's flagship at a fraction of the cost. Match the model to the task, route easy work to the cheap option, and where a job can run as plain deterministic code, do that instead of paying a model to redo it every time.
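
The routing rule described above can be sketched as a three-way decision. This is an illustration only. The Route names and the rule_based and hard flags are hypothetical labels that a team would have to assign itself, and no specific model or vendor is implied.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    CODE = "deterministic code"
    CHEAP = "cheap model"
    FLAGSHIP = "flagship model"


@dataclass
class Task:
    description: str
    rule_based: bool = False  # plain code can do this job every time
    hard: bool = False        # the job genuinely needs the top model


def route(task: Task) -> Route:
    """Pick the cheapest option that can do the job."""
    if task.rule_based:
        return Route.CODE      # no model call at all
    if task.hard:
        return Route.FLAGSHIP  # pay for capability only when needed
    return Route.CHEAP         # default: easy work goes to the cheap option


# Hypothetical tasks, for illustration.
tasks = [
    Task("reformat invoice dates", rule_based=True),
    Task("summarize a support ticket"),
    Task("draft a multi-market tax analysis", hard=True),
]
choices = [route(t) for t in tasks]
```

The order of the checks carries the policy: deterministic code is tried first, the flagship model has to be asked for explicitly, and everything else defaults to the cheaper option.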

Habit 05

Use fewer, smart-enough agents

The reflex is to throw an autonomous agent and the biggest model at everything. Most tasks do not need it. A smaller number of well-scoped agents, each pointed at a defined job, will cost less and break less than a swarm of expensive ones improvising their way through your systems. Smart enough beats smartest possible almost every time.

Proof it pays off: Amazon's finance team rebuilt how it tracks tax-rule changes across markets, taking a task that ran 26 minutes per update down to 2 — a 92% drop, with most of the output accepted as-is. That is what going deeper looks like when you redesign the process first and point AI at the whole workflow, not a single step.

Tokenmaxxing was a phase, not a strategy

It made sense for about a year, while everyone learned what these tools could do. That period is ending. The labs are moving to usage-based pricing, finance teams are catching up, and the cost of intelligence keeps dropping for anyone willing to be deliberate about it.

So do not read the headlines about runaway bills as a reason to retreat. Read them as a reason to get specific. Stay aggressive on the value AI can create, and get disciplined about what it costs to create it. That gap — between spending and spending with intention — is where the next year gets won.

Key Takeaways

AI got cheaper per token, but agents multiplied usage — enterprise AI bills are up an estimated 320% even as token prices fell 98%.
Cutting the budget is the wrong reflex. Most companies are still early, and pulling back means paying for the experiments while skipping the payoff.
Spend with intention: automate whole workflows, measure business impact over activity, and set usage guardrails before you scale, not after.
Match the model to the task and run fewer, well-scoped agents. Smart enough beats smartest possible almost every time.

The AI bill came due. Now spend with intention.
