May 2026

Not All Tokens Are Equal

How cheap energy and cheap intelligence will reshape the physical economy.

Cheap energy created the industrial economy. Cheap intelligence will create something far bigger. But intelligence is not magic. It has to be produced somewhere, by machines that consume power, capital, cooling, land, chips, networks, and human coordination.

For the last decade, AI has mostly been discussed as software. That era is ending. Intelligence is becoming an industrial output.

The basic unit of that output is the token: the fragment of text, code, image, sound, action, or reasoning step that an AI system consumes or produces. Tokens are often priced like a cloud API, but they behave more like manufactured goods. They are made in physical places, under physical constraints, with very different economics depending on what they are used for.

The mistake is to talk about "AI compute" as one market. Not all tokens are equal. A background summarization token, a real-time voice token, a frontier training token, and an autonomous-agent token do not have the same value, latency requirement, power profile, or geography.

The future AI economy will be a tiered token economy.

A five-part map of the token economy

The old split, training versus inference, is useful but incomplete. Training explains how models are created. Inference explains how they are used. But the next phase of AI infrastructure will be shaped by workload economics: latency, value density, schedulability, locality, reliability, and energy cost.

A better map has five layers.

| Token workload | What it does | Primary constraint | Likely location |
| --- | --- | --- | --- |
| Capability creation | Pretraining, post-training, evals, synthetic data | Scale, networking, power, secure supply | Gigawatt-scale hyperscale campuses |
| Commodity batch | Summaries, translation, embeddings, offline processing | Lowest cost per useful token | Cheap-power regions, off-peak capacity |
| Premium interactive | Chat, coding, AI search, voice, support | Latency, uptime, perceived quality | Metro and regional inference clusters |
| Agentic outcome | Research agents, coding agents, workflow automation | Reliability, orchestration, memory, tool use | Distributed always-on inference fleets |
| Local / autonomous | Phones, vehicles, robots, homes, factories, consoles | Privacy, autonomy, bandwidth, milliseconds | Edge devices and local micro-infrastructure |

Capability-creation tokens are the least like a normal commodity. They are inputs into model progress. They require vast clusters, elite engineering teams, advanced networking, and enormous power. These workloads will concentrate in strategic locations: the United States, allied markets with abundant power, the Gulf states, Canada, the Nordics, India, and China, with its parallel domestic AI stack.

Commodity batch tokens are different. They can wait. They can run overnight. They can move to wherever electrons, cooling, and interconnection are cheapest. This is where AI starts to look like a flexible industrial load. If a task can be delayed by minutes or hours, it can chase curtailed renewables, hydro, nuclear, cheap gas, or underutilized capacity.
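To make the "flexible industrial load" idea concrete, here is a minimal sketch of a deferrable batch job choosing its run hours from a price forecast. The prices, the 10 MW load, and the function names are all made up for illustration, and the sketch assumes the job can pause and resume freely between hours, which real workloads may not allow.

```python
from typing import List


def cheapest_hours(prices: List[float], hours_needed: int, deadline: int) -> List[int]:
    """Pick the cheapest hours in [0, deadline) for a deferrable batch job.

    prices[h] is a hypothetical forecast price ($/MWh) for hour h.
    Assumes the job can be paused and resumed freely between hours.
    """
    if hours_needed > deadline or deadline > len(prices):
        raise ValueError("job cannot finish inside the window")
    # Rank candidate hours by price, keep the cheapest, return them in time order.
    chosen = sorted(range(deadline), key=lambda h: prices[h])[:hours_needed]
    return sorted(chosen)


def energy_cost(prices: List[float], hours: List[int], load_mw: float) -> float:
    """Dollar cost of running a constant load_mw during the given hours."""
    return sum(prices[h] * load_mw for h in hours)


# Illustrative, invented prices: expensive early, cheap in the middle of the window.
forecast = [90.0, 85.0, 40.0, 20.0, 15.0, 25.0, 60.0, 95.0]

flexible = cheapest_hours(forecast, hours_needed=3, deadline=8)
immediate = [0, 1, 2]  # what a job that cannot wait would have to use

flexible_cost = energy_cost(forecast, flexible, load_mw=10.0)
immediate_cost = energy_cost(forecast, immediate, load_mw=10.0)

print("flexible hours:", flexible, "cost:", flexible_cost)
print("immediate hours:", immediate, "cost:", immediate_cost)
```

With these invented numbers the job that can wait pays well under a third of what the job that must start immediately pays. The gap is the whole economic case for treating batch tokens as a schedulable load.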

Premium interactive tokens are the opposite. When a human is waiting for an answer, milliseconds matter. AI search, voice assistants, coding copilots, customer support, and real-time collaboration need regional infrastructure near users. These tokens may cost more, but they are also worth more because they shape user experience.

Agentic outcome tokens may become the largest and strangest market. A chatbot answers a prompt. An agent performs a job. It plans, searches, calls tools, retries, verifies, stores memory, asks for help, and reports back. One user request can trigger thousands of model calls. In this market the unit of value is not the token. It is the completed task.
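A rough way to see why the completed task, not the token, is the unit of value: compare two hypothetical agents on expected spend per successful task. The call counts, token counts, prices, and success rates below are invented for illustration. The sketch also assumes each attempt costs the same and that attempts succeed independently, so the expected number of attempts is one over the success rate.

```python
def cost_per_completed_task(
    calls_per_attempt: int,
    tokens_per_call: int,
    price_per_million_tokens: float,
    success_rate: float,
) -> float:
    """Expected token spend, in dollars, to get one successful task.

    Assumes every attempt costs the same and succeeds independently with
    probability success_rate, so expected attempts = 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    tokens_per_attempt = calls_per_attempt * tokens_per_call
    cost_per_attempt = tokens_per_attempt * price_per_million_tokens / 1_000_000
    return cost_per_attempt / success_rate


# Hypothetical agent A: cheap tokens, many calls, unreliable.
cheap_but_flaky = cost_per_completed_task(
    calls_per_attempt=2000,
    tokens_per_call=1000,
    price_per_million_tokens=1.0,
    success_rate=0.25,
)

# Hypothetical agent B: tokens cost three times as much, but it finishes the job.
pricey_but_reliable = cost_per_completed_task(
    calls_per_attempt=1000,
    tokens_per_call=1000,
    price_per_million_tokens=3.0,
    success_rate=0.75,
)

print(f"cheap tokens:  ${cheap_but_flaky:.2f} per completed task")
print(f"pricey tokens: ${pricey_but_reliable:.2f} per completed task")
```

Under these assumptions the agent with the more expensive tokens is the cheaper way to get work done, which is the sense in which token price stops being the number that matters.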

Local and autonomous tokens will grow because not everything should travel to a cloud data center. Phones, PCs, vehicles, drones, robots, hospitals, factories, gaming consoles, and homes will all generate local intelligence. Some of it will be small and cheap. Some of it will be mission-critical. A vehicle cannot wait for the cloud to decide whether to brake. A home energy system should not need a distant GPU cluster to decide when to charge a battery.

The council lens

Three frames help keep the analysis honest.

Jensen Huang's "AI factory" metaphor is the right starting point. The modern AI data center is not a generic server farm. It is a factory that turns electricity, data, chips, cooling, and software into intelligence. The output is not steel or chemicals. It is tokens.

Ben Thompson's key insight is that inference changes the business model. Training created the frontier models, but inference is where users, workflows, platforms, and profits show up. The companies that own demand, distribution, operating systems, developer interfaces, and workflow integration may capture more value than the companies that merely produce raw tokens.

Jigar Shah's lens keeps the story honest. Power is not an abstraction. The bottlenecks are interconnection queues, financing, substations, transformers, permitting, transmission, firm capacity, and local politics. AI data centers can either stress the grid or help finance a better one. The difference depends on whether AI load becomes flexible, additional, and aligned with new generation.

The discipline: AI tokens are manufactured, monetized, and constrained. The factory matters. The customer interface matters. The grid matters. Any analysis that ignores one of the three is incomplete.

Why geography changes by token type

Cheap electricity matters, but it does not explain everything.

The cheapest token is not always the most valuable token. A token that arrives too late, violates data residency, fails in a workflow, or cannot be trusted may be worthless. The valuable token is delivered at the right latency, reliability, capability, compliance level, and price.

That means AI infrastructure will not become one giant centralized cloud. It will become a portfolio.
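The portfolio logic can be sketched as a constrained choice: among the sites that meet a request's latency and data-residency limits, take the cheapest, and treat a request with no eligible site as unserved. The site names, regions, latencies, and prices are hypothetical, and a real placement decision would also weigh reliability and capability, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Site:
    name: str
    region: str
    latency_ms: float      # round-trip latency to the user
    price_per_mtok: float  # dollars per million tokens


@dataclass(frozen=True)
class Request:
    max_latency_ms: float
    allowed_regions: FrozenSet[str]


def pick_site(request: Request, sites: List[Site]) -> Optional[Site]:
    """Return the cheapest site that meets latency and residency, else None."""
    eligible = [
        s for s in sites
        if s.latency_ms <= request.max_latency_ms
        and s.region in request.allowed_regions
    ]
    return min(eligible, key=lambda s: s.price_per_mtok, default=None)


# Hypothetical sites: one cheap and far away, two closer and pricier.
sites = [
    Site("remote-hydro", region="us", latency_ms=180.0, price_per_mtok=0.5),
    Site("regional-eu", region="eu", latency_ms=60.0, price_per_mtok=1.2),
    Site("metro-eu", region="eu", latency_ms=25.0, price_per_mtok=2.0),
]

voice = Request(max_latency_ms=50.0, allowed_regions=frozenset({"eu"}))
batch = Request(max_latency_ms=60_000.0, allowed_regions=frozenset({"us", "eu"}))
impossible = Request(max_latency_ms=10.0, allowed_regions=frozenset({"eu"}))

for label, req in [("voice", voice), ("batch", batch), ("impossible", impossible)]:
    site = pick_site(req, sites)
    print(label, "->", site.name if site else "no eligible site")
```

The same three sites serve different requests differently: the voice request lands on the most expensive site because nothing else is fast enough, while the batch request goes to the cheapest one.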

| Layer | Looks like | Drives location |
| --- | --- | --- |
| Frontier training | Strategic industrial assets, like aluminum smelters, chip fabs, or military bases | Power, land, networking, security, capital, political support |
| Batch inference | Commodity manufacturing chasing cheap electrons and high utilization | Curtailed renewables, off-peak capacity, low-cost interconnection |
| Interactive inference | Follows people. Metro and regional clusters near users | Response time, data residency, network density |
| Agentic compute | Follows workflows. Distributed fleets with memory and tool access | Reliability, orchestration, identity, observability |
| Edge intelligence | Follows machines. Phones, cars, factories, homes, consoles | Privacy, autonomy, bandwidth, milliseconds |

Frontier training clusters will look like strategic industrial assets. They will locate where power, land, networking, security, capital, and political support line up. These campuses may increasingly resemble aluminum smelters, chip fabs, or military bases: large, energy-intensive, and nationally important.

Batch inference will look more like commodity manufacturing. It will chase cheap electrons and high utilization. If inference costs keep falling, more workloads will become economically possible, creating the classic rebound effect: each token may take less energy to produce, yet total energy demand may still rise because so many more tokens get produced.
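The rebound effect is easy to check with a toy model. The sketch below assumes, purely for illustration, that the price per token falls in proportion to the energy per token and that demand follows a constant-elasticity curve. Neither assumption comes from measured data, and the elasticity values are placeholders.

```python
def total_energy_multiplier(efficiency_gain: float, elasticity: float) -> float:
    """Ratio of new total energy use to old after an efficiency gain.

    Toy assumptions: price per token falls in proportion to energy per
    token, and tokens demanded scale as price ** (-elasticity).
    """
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    energy_per_token = 1.0 / efficiency_gain         # unit energy falls
    tokens_demanded = efficiency_gain ** elasticity  # demand rises as price falls
    return energy_per_token * tokens_demanded


# A hypothetical 4x efficiency gain under three demand responses.
for elasticity in (0.5, 1.0, 1.5):
    ratio = total_energy_multiplier(efficiency_gain=4.0, elasticity=elasticity)
    print(f"elasticity {elasticity}: total energy x{ratio:.2f}")
```

In this model the multiplier works out to the efficiency gain raised to the power of elasticity minus one. Total energy use falls when demand is inelastic, stays flat at an elasticity of one, and rises when demand is elastic. Whether token demand is actually that elastic is an empirical question the sketch does not answer.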

Interactive inference will follow people. The constraint is not just power. It is response time. AI search in Tokyo, voice agents in Los Angeles, coding copilots in Denver, and medical AI in London need infrastructure close enough to feel instant and reliable.

Agentic compute will follow workflows. Some agents will need low latency. Others can run in the background. The infrastructure stack will include not just GPUs, but memory, permissions, identity, observability, rollback, data access, and tool execution. This is where token pricing may become less important than outcome pricing.

Edge intelligence will follow machines. The phone becomes an inference device. The car becomes an autonomous compute node. The factory becomes a local AI system. The gaming console becomes a high-performance consumer AI box. A SPAN-style home, with solar, battery, EV, smart panel, sensors, and fiber, starts to look like a tiny energy-aware intelligence node.

Most homes will not train frontier models. But they may run personal assistants, privacy-sensitive inference, energy optimization, security, appliance coordination, home robotics, and local caching. The home becomes part of the intelligence grid.

The weird frontier: orbit

Space data centers are not the near-term answer. Launch cost, maintenance, radiation, heat rejection, and data transfer are severe constraints. But orbital compute is useful as a thought experiment because it reveals the real bottleneck: power.

If terrestrial electricity becomes the limiting reagent for frontier AI, people will consider increasingly strange compute geographies. Orbit offers continuous solar energy and isolation, but the economics are far from proven.

The near-term map: Power-rich terrestrial campuses for training, metro inference clusters for interactive workloads, and edge devices for everything that has to be private, fast, or autonomous. Orbit is the optionality layer for the 2030s.

The bigger idea

The 21st century will be defined by the race to produce abundant energy and abundant intelligence.

Those two races are now becoming one. AI needs energy. Energy systems need intelligence. Data centers are becoming grid actors. Homes are becoming controllable energy nodes. Vehicles are becoming batteries, sensors, and computers. Agents are turning software from a tool into labor. Robotics will move intelligence into the physical world.

This is why the token economy matters. Tokens are not merely software events. They are the first measurable units of cheap intelligence.

The next infrastructure winners may not simply be the companies with the biggest models or the most GPUs. They may be the companies that best integrate compute, energy, networking, inference orchestration, capital, and physical deployment.

Cheap energy created the industrial economy. Cheap intelligence will create something far bigger. But only if we build the factories, grids, networks, devices, and institutions that can produce it.

Josh Lake is a serial entrepreneur thinking about AI and the convergence of energy and intelligence. He's the founder of Electrify Everything Now, and previously co-founded Elephant Energy.
