The Cheapest Token Is the One You Never Spend

Why the most valuable AI strategy points at simplification before automation.

Jul 04, 2026

The short version. Most enterprises are deploying AI to help the organization handle more work. The larger opportunity is using it to need less. The evidence for the gap is mounting: MIT’s NANDA initiative reported that 95 percent of generative AI pilots in its analysis had not yet delivered measurable financial impact, and Gartner expects more than 40 percent of agentic projects to be canceled by the end of 2027. The common thread is not model quality. It is what the model is pointed at. Two principles more than a century apart, Gates on automation and Jevons on falling cost, both warn that technology amplifies whatever you aim it at, for better or worse. The disciplined move is to use AI to simplify and eliminate unnecessary work before automating it. The cheapest token is the one you never spend.

Every enterprise rolling out AI is making a structural choice, and most do not realize it.

The choice is not which model to license or which platform to standardize on. It runs deeper than that, and it quietly shapes the return on every dollar spent. The choice is whether you are using AI to help the organization handle more work or to help the organization eliminate the work it needs to handle.

Those two goals sound adjacent. They are nearly opposite. And the default setting, in most enterprises, is the first one.

The two postures diverge at every level. One handles more requests; the other eliminates them. One speeds up coordination; the other removes the need for it. One layers automation onto existing complexity; the other reduces the complexity first. One spends more tokens as it scales; the other spends fewer. They can look alike on a roadmap. They build very different organizations.

The Default Setting Is “Do More”

The dominant AI strategy in 2026 is broad distribution. Everyone gets a license, everyone is encouraged to explore, and leadership waits for productivity gains to surface from the bottom up. The stated hope, usually somewhere in an all-hands deck, is that people will find their own ways to cut costs and improve quality.

That hope is not unreasonable, and the pressure behind it is real. Boards are asking about the AI plan. Competitors are announcing initiatives. Nobody wants to be the organization that moved too slowly. Broad deployment is the most visible way to demonstrate momentum, and visible momentum seems to be what this moment rewards.

But visible momentum and realized value are not the same thing, and the gap between them is now well documented.

What the Evidence Already Shows

In 2025, MIT’s NANDA initiative published one of the more sobering assessments of enterprise AI to date. An analysis of roughly 300 deployments found that 95 percent of generative AI pilots had not yet delivered measurable impact on the bottom line, despite an estimated $30 billion to $40 billion in collective investment. The researchers called the result the “GenAI Divide”: high adoption, low transformation.

The instructive part is the cause. The study did not blame the technology. It blamed integration. The organizations seeing little return were the ones treating AI as a tool to scatter broadly rather than a capability to build into how work actually gets done. The money, the researchers noted, was flowing toward the most visible functions rather than the ones where the returns were highest.

An independent reading from Deloitte points the same way. Its 2026 survey of more than 3,200 leaders across 24 countries found that workforce access to sanctioned AI tools jumped by roughly half in a single year, reaching about 60 percent of employees, while only about a quarter of organizations had moved even 40 percent of their experiments into production. Access raced ahead. Value lagged behind.

Gartner’s view of the agent wave is no kinder. The firm forecasts that more than 40 percent of agentic AI projects will be canceled by the end of 2027, citing escalating cost, unclear business value, and weak risk controls. Its analysts put a finer point on it: many of the use cases being built as agents today do not require agents at all. An agent that plans, reasons, and calls tools to do what a simpler automation already handled is not a breakthrough. It is the same work at a higher price.

Read these findings together and a pattern emerges. The problem is rarely the model. The problem is what the model is being pointed at.

An Old Rule, Newly Urgent

There is a principle from the early days of business computing that has held for thirty years. In 1996, Bill Gates wrote that automation applied to an inefficient operation will magnify the inefficiency. Apply technology to a process that works, and you amplify what works. Apply it to a process that does not, and you amplify the waste, now faster, more consistent, and far harder to unwind.

This is the rule most enterprise AI strategies are not built to respect, because most are built around deployment rather than understanding. The instinct is to find work and automate it. The discipline that gets skipped is the one that asks, before automating anything, whether the work should exist in its current form.

A caution belongs here. Not all complexity is waste. Some of it encodes legitimate control: regulatory requirements, safety checks, segregation of duties that exist for good reason. The discipline is not to strip complexity out wholesale. It is to separate the complexity that earns its place from the complexity that merely accumulated, and to do that before automating either one.

This is where AI is useful in a way unrelated to automation. Large language models are unusually good at showing how work actually moves through an organization: synthesizing across sources no person could hold at once, surfacing the redundancies and exceptions that accumulate invisibly in complex processes, and offering alternative framings that expose what a workflow is really doing. Pointed at a process before you automate it, AI becomes an instrument of clarity. An organization that uses it that way tends to discover that a meaningful share of what it was about to automate could instead be simplified, consolidated, or eliminated. The automation that follows is then aimed at work that has earned its place.

A concrete version makes the point. A support organization can build an agent that drafts a reply to every customer escalation, or it can use AI to find that a handful of upstream approval steps generate a large share of those escalations in the first place. The first option automates the symptom. The second removes the cause, and the escalations it prevents never need a token at all.
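The second option can be pictured as a simple tally. What follows is a minimal sketch, not a real analysis: the records, the step names, and the 80 percent target are all invented for illustration, and it assumes each escalation has already been tagged with the upstream step that triggered it (the tagging is where an LLM would plausibly help).

```python
from collections import Counter

# Hypothetical escalation records: (ticket_id, upstream_step_that_triggered_it).
# Step names and counts are invented for illustration only.
ESCALATIONS = [
    (1, "credit_approval"), (2, "credit_approval"), (3, "address_verification"),
    (4, "credit_approval"), (5, "refund_signoff"), (6, "address_verification"),
    (7, "credit_approval"), (8, "manual_review"), (9, "address_verification"),
    (10, "credit_approval"),
]


def steps_covering(records, target_percent=80):
    """Return the fewest top-ranked steps whose escalations reach target_percent of the total.

    Returns (chosen_steps, escalations_covered, total_escalations).
    """
    counts = Counter(step for _, step in records)
    total = sum(counts.values())
    chosen, covered = [], 0
    for step, n in counts.most_common():
        # Integer comparison avoids floating-point edge cases at the threshold.
        if covered * 100 >= target_percent * total:
            break
        chosen.append(step)
        covered += n
    return chosen, covered, total


if __name__ == "__main__":
    steps, covered, total = steps_covering(ESCALATIONS)
    print(f"{len(steps)} steps account for {covered} of {total} escalations: {steps}")
```

In this made-up data, two approval steps account for eight of ten escalations. Those two steps are where the simplification conversation would start, before anyone builds a reply-drafting agent.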

That sequence is not a delay. It is what makes the eventual automation worth building.

The Cost You Cannot See

There is a second problem hiding underneath the first. I watched it play out once already, in the early cloud era.

When enterprises first moved to cloud infrastructure, departmental consumption was largely invisible to the people generating it. Costs flowed into a central budget line, no manager had a clear signal of what their own usage was costing, and spending grew without the friction that accountability creates. An entire discipline, FinOps, emerged to restore that visibility.

AI token consumption is following the same path, and in many organizations it is further along than anyone realizes. The unit of cost, the token, does not appear in standard dashboards. Most department managers responsible for their own budgets have no line of sight into what their teams’ AI use is actually costing, because that cost sits on a centralized bill that nobody with operational authority is monitoring. The discipline that would prompt a manager to ask, “Is this worth it?” never gets triggered because the price tag is invisible.

The result compounds quietly. Industry practitioners describe pilot workloads that ran at $10,000 a month swelling to $400,000 a month in production, with no single decision causing the jump. It simply accumulated, across dozens of features and teams, none of them watching the meter.

A manager operating without that visibility is not being careless. They are being asked to be responsible for something they cannot see. That is not a personal failure. It is a governance gap, and governance is a leadership decision.
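Closing that gap starts with attribution. As a minimal sketch of what a per-department view of token spend might look like: the prices, the department names, and the shape of the usage log below are all assumptions made for illustration, not any vendor's actual rates or export format.

```python
from collections import defaultdict

# Hypothetical prices in dollars per million tokens. Real rates vary by model and vendor.
PRICE_PER_MILLION = {"input": 3.00, "output": 15.00}

# Hypothetical usage log rows: (department, input_tokens, output_tokens).
USAGE_LOG = [
    ("support", 40_000_000, 8_000_000),
    ("finance", 5_000_000, 1_000_000),
    ("support", 20_000_000, 4_000_000),
]


def cost_by_department(rows, prices=PRICE_PER_MILLION):
    """Sum dollar cost per department from raw token counts."""
    totals = defaultdict(float)
    for dept, tokens_in, tokens_out in rows:
        dollars = (tokens_in * prices["input"] + tokens_out * prices["output"]) / 1_000_000
        totals[dept] += dollars
    return dict(totals)


if __name__ == "__main__":
    # Print departments from largest spend to smallest.
    for dept, dollars in sorted(cost_by_department(USAGE_LOG).items(), key=lambda kv: -kv[1]):
        print(f"{dept:10s} ${dollars:,.2f}")
```

The arithmetic is trivial. The hard part is organizational: tagging usage by team in the first place, and putting the resulting number in front of the manager who owns the budget.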

Why Cheaper Tokens Make This Worse, Not Better

Here is the objection that usually surfaces at this point: the cost of AI is falling so fast that the problem will solve itself. Per-token prices are collapsing. Why worry about a number that keeps dropping?

Because the bill is going up anyway, and the reason it is going up is the whole point.

The pattern has a name that predates computing entirely. In 1865, the economist William Stanley Jevons observed that as coal-burning engines became more efficient, England did not burn less coal. It burned far more, because cheaper energy made entirely new uses viable. Efficiency expanded demand faster than it reduced the coal needed for each unit of work.

The same thing is happening with AI, and the people running the largest AI businesses are saying so openly. When cheaper inference briefly rattled the market in early 2025, Microsoft’s Satya Nadella greeted it with a single line: Jevons paradox strikes again. Apollo’s chief economist, Torsten Slok, has put numbers to it this year. The price of a single token has fallen roughly 90 percent since 2023, and over the same stretch, total enterprise spending on these models has been climbing, not falling. As tokens get cheaper, Slok observed, companies do not pocket the savings. They run more agents, automate more workflows, and generate more code. The unit cost of intelligence collapses while the aggregate bill keeps rising.

The mechanism doing this is precisely the one enterprises are racing toward. A task that costs a few cents as a single question can cost dollars when rebuilt as an agent that plans, loops, and calls other systems along the way. Cheaper tokens do not restrain that. They invite more of it.
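A back-of-the-envelope sketch shows how the cents-to-dollars jump can happen. Everything here is an assumption chosen for illustration: the prices, the token counts, the 25-step loop, and the simplification that each agent step re-reads the original context plus all prior outputs.

```python
def call_cost(input_tokens, output_tokens, price_in, price_out):
    """Dollar cost of one model call; prices are dollars per million tokens."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def agent_cost(steps, base_input, output_per_step, price_in, price_out):
    """Cost of an agent loop in which every step re-reads the base context plus all prior outputs."""
    total = 0.0
    context = base_input
    for _ in range(steps):
        total += call_cost(context, output_per_step, price_in, price_out)
        context += output_per_step  # prior output is fed back in as input on the next step
    return total


def bill_multiple(price_drop, volume_multiple):
    """How the total bill changes when unit price falls by price_drop and volume grows."""
    return (1 - price_drop) * volume_multiple


if __name__ == "__main__":
    # Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
    single = call_cost(2_000, 500, 3.0, 15.0)
    agent = agent_cost(steps=25, base_input=20_000, output_per_step=1_000,
                       price_in=3.0, price_out=15.0)
    print(f"single question: ${single:.4f}")
    print(f"agent run:       ${agent:.4f}")
    print(f"bill multiple after a 90% price drop and 20x volume: {bill_multiple(0.90, 20):.2f}")
```

Under these assumed numbers the single question costs a little over a cent and the agent run costs close to three dollars, a difference of roughly two hundredfold. The last function makes the Jevons arithmetic explicit: after a 90 percent price drop, any growth in volume beyond tenfold leaves the total bill higher than before.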

This brings the argument to its sharpest edge. Falling cost does not reward discipline. It amplifies whatever you have pointed the technology at. Aim it at validated, valuable work, and the expanding economics work in your favor. Aim it at work that should not exist, and you have built a machine that consumes more every quarter to do something nobody needed. Treating the falling price as a reason not to forecast is not a strategy. It is the most reliable way to end up owing your board an explanation.

Two Laws, One Warning

Step back and notice what is actually on the table.

Two principles, more than a century apart, are converging on this single moment. Gates says automation magnifies the operation you apply it to. Jevons says falling cost magnifies the consumption of the resource you apply it to. Both are about amplification. Both are now pointing at the same decision.

Point AI at simplified, necessary work, and the two laws compound in your favor: the automation magnifies an efficient process, and the falling cost funds more of something worth doing. Point it at inherited complexity, and the same two laws turn against you: the automation magnifies the inefficiency, and the economics pour ever more spend into sustaining it.

There is only one move that escapes both. Eliminating unnecessary work does not magnify it or fund it. It removes the work from the field entirely. The cheapest token, the one with no future cost and no compounding exposure, is the one you designed your way out of ever needing to spend.

The Choice Underneath the Tooling

This is the thread that runs under everything I write here. The question that matters in enterprise technology is rarely which tool you bought. It is whether the structure you built around it reduces the burden the enterprise carries, or simply relocates that burden somewhere harder to see.

I made a version of this argument in an earlier post about as-a-Service models: when you move operational execution to a provider, the overhead does not disappear, it relocates. AI presents the same fork in sharper form. Used to handle more, it relocates the cost of unexamined work into a token bill that grows as the price falls. Used to need less, it reduces the work itself.

As AI adoption proceeds, the coordination structures and cost burden an enterprise carries can transform, persist, or, in some cases, compound. Which of those happens is not determined by the model you chose. It is determined by what you decide to point it at. Simplification is the choice that lets those structures truly transform. Automating the inherited mess is the choice that makes them compound, now magnified by one old law and funded by another.

So, before the next question about which use cases to automate or which agents to build, there is a prior one worth sitting with. If you had to describe your organization’s actual theory of value for AI, not the tools deployed or the seats licensed, but the value framework underneath them, how confident are you in the answer?

None of this is free of friction. Simplification produces less visible progress than deployment. One creates fewer workflows; the other creates more demonstrations, and a leader measured on adoption feels that difference. But demonstrations measure activity, and activity is not the same as value.

This points toward a change in how leaders champion the technology, and it costs nothing to make. The encouragement can stay exactly as enthusiastic, but it carries a sequence: spend your tokens on simplification before you spend them on automation. Same broad access, same permission to explore, aimed first at understanding and reducing the work rather than encoding it. That single shift turns a scattered experiment into a disciplined one, and it points the organization’s spend at the gains that hold.

The enterprises that look back on this period with satisfaction will not be the ones that moved fastest. They will be the ones that asked what to simplify before they asked what to automate, and used AI as an instrument of clarity rather than a substitute for it. That is the harder path. It is also the one that produces something worth building on.

Systemic Inflections examines the organizational and structural questions beneath enterprise technology strategy, informed by four decades in the field and active research into how enterprises reorganize around as-a-Service models. If that is the work you are doing, subscribe to get each post as it publishes.
