For most of the last fifteen years we talked about artificial intelligence as if it had no body. It lived in “the cloud,” a phrase chosen precisely because it suggested weightlessness. Something floating above us, infinitely available, nowhere in particular. You could not point at it. You could only point up.
That story is quietly ending. The cloud has an address now. It has a substation, a water permit, a zoning hearing, and a line of trucks idling at a gate in rural Louisiana, where one campus alone sprawls across thousands of acres, larger than the city park most of us grew up walking through. The thing we once described as ambient has become the most physical industry on the planet outside of energy itself. And in a turn almost nobody priced in three years ago, it has become hard to get.
We spent a decade assuming the limiting factor in AI would be ideas. The next breakthrough in architecture, the next clever training trick, the next leap in reasoning. Those still matter. But sit with founders, infrastructure operators, or the people who actually sign hardware contracts, and you hear something different. The frontier is no longer gated mostly by intelligence. It is gated by the physical capacity to run intelligence at all. By chips, by electricity, by land, by cooling, by financing, and by the unglamorous work of coordinating all of it at a scale the world has never attempted.
Compute became the constraint. And once you see it, you cannot stop seeing it underneath everything else.
What “compute” actually means now
When people say compute, they usually picture a chip. That picture is too small. Compute in 2026 is a stack, and every layer of that stack is straining.
At the top is the accelerator, the specialized processor that does the heavy mathematical lifting of training and running models. The most capable of these are effectively rationed. Nvidia’s Blackwell generation has been sold out well into the middle of this year, with a reported backlog measured in the millions of units. The largest cloud providers placed multibillion-dollar orders in advance and absorbed most of the available supply, which means a queue formed behind them. That queue is not really a queue of chips. It is a queue of ambitions. Startups, research labs, hospitals, banks, and entire countries are standing in it, waiting for physical capacity to catch up to what they already know they want to build.
Below the accelerator sits a quieter bottleneck that almost no one outside the industry talks about: memory. The high-bandwidth memory that feeds these chips has been booked solid for the year, with the major manufacturers stating plainly that their 2026 capacity is already gone. One chairman in the memory business has suggested the crunch could last until the end of the decade. Pricing in some parts of the memory market has become so volatile that buyers describe it shifting by the hour. The advanced packaging that bonds these components together is its own chokepoint, with a small number of facilities in the world capable of doing it at the required precision. A shortage at any one of these layers becomes a shortage of the whole.
The result is a split that did not exist a few years ago. There is now a compute-rich world and a compute-poor world, and the line between them is not about who has the best ideas. It is about who secured allocation early, who can finance multiyear forward commitments, and who has the relationships to get to the front of the line. Access to compute is no longer assumed. It is earned, negotiated, or, increasingly, quietly denied.
Why AI demand bends the curve
Here is the part that makes this more than a temporary supply hiccup. The shape of demand itself has changed.
For years the heavy compute story was about training. You spent enormous resources once, produced a model, and then served it relatively cheaply. The cost lived in the past tense. That framing no longer holds. The center of gravity is moving toward inference, the cost of actually running a model every time someone uses it. Every question asked, every document summarized, every image generated carries a real, recurring marginal cost in electricity and silicon. When a few million people use a tool, that cost is a rounding error. When a few billion do, and they do it constantly, it becomes one of the largest line items in the digital economy.
Then came agents. This is the development that quietly rewrites the math. A person using a chatbot sends a message and waits. An agent does not wait. It works in loops, calling a model many times to plan, check itself, search, revise, and act, often running for minutes or hours without a human in the loop. A single human request can fan out into hundreds of model calls. Multiply that across software that increasingly delegates work to these systems, and you get a demand curve that does not grow with the number of users. It grows with the number of tasks those users are willing to hand off. That number is effectively unbounded.
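The fan-out described above can be put into rough arithmetic. What follows is a minimal sketch, not a measurement: the function name and every number in it are hypothetical, chosen only to show that total model calls scale with delegated tasks and calls per task rather than with the user count alone.

```python
def daily_model_calls(users: int, tasks_per_user: int, calls_per_task: int) -> int:
    """Total model calls per day = users x delegated tasks x calls per task."""
    return users * tasks_per_user * calls_per_task


# Hypothetical inputs, for illustration only. The user count never changes.
USERS = 1_000_000

# Chatbot pattern: one request, one model call.
chat = daily_model_calls(USERS, tasks_per_user=10, calls_per_task=1)

# Agent pattern: each request fans out into a couple of hundred calls.
agent = daily_model_calls(USERS, tasks_per_user=10, calls_per_task=200)

# Same agents, but each user now hands off five times as many tasks.
agent_more_delegation = daily_model_calls(USERS, tasks_per_user=50, calls_per_task=200)

for label, calls in [
    ("chat", chat),
    ("agent", agent),
    ("agent, more delegation", agent_more_delegation),
]:
    print(f"{label:>24}: {calls:,} calls/day ({calls // chat}x chat)")
```

In this toy setup the user base is flat across all three cases, yet demand rises 200-fold and then 1,000-fold. The growth comes entirely from the two factors that agents change.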
This is why efficiency gains, real and impressive as they are, keep getting swallowed. The cost per individual task has been falling at a pace that has few parallels in the history of technology. And yet total consumption keeps climbing, because cheaper compute does not reduce demand. It unlocks new demand that was previously uneconomical. We make each unit cheaper and then immediately find a thousand new uses for it. The savings evaporate into scale.
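One way to see why the savings evaporate is a constant-elasticity demand curve. This is an illustrative assumption of mine, not a model the figures above come from: the `elasticity` parameter, the function names, and the tenfold cost drop are all hypothetical. The sketch only shows the mechanism, which is that when demand is elastic enough, a cheaper unit leads to more total consumption.

```python
def tasks_demanded(cost_per_task: float, elasticity: float) -> float:
    """Tasks run at a given unit cost, relative to 1.0 task at a cost of 1.0.

    Assumes constant-elasticity demand: tasks = cost ** (-elasticity).
    """
    return cost_per_task ** (-elasticity)


def total_spend(cost_per_task: float, elasticity: float) -> float:
    """Unit cost times volume. If cost tracks the silicon and electricity
    used per task, this is also a proxy for total resources consumed."""
    return cost_per_task * tasks_demanded(cost_per_task, elasticity)


# Hypothetical scenario: the cost of one task falls tenfold (1.0 -> 0.1).
NEW_COST = 0.1

for elasticity in (0.5, 1.0, 1.5):
    volume = tasks_demanded(NEW_COST, elasticity)
    spend = total_spend(NEW_COST, elasticity)
    print(f"elasticity {elasticity}: volume x{volume:.1f}, total spend x{spend:.2f}")
```

With elasticity below one, the efficiency gain shrinks the total bill. Above one, volume grows faster than cost falls and the total climbs, which is the pattern the paragraph describes.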
So the question stops being “how much power does one query use?” It becomes “how much new capacity does the world have to build, where will it physically sit, and what will it run on?” And that question lands squarely on top of the least flexible system we have.
The grid was not built for this
You can move a software company overnight. You cannot move a power grid.
This is the collision happening right now. A modern AI data center can be designed and built in two or three years. The transmission lines, substations, and generation capacity that feed it take far longer, sometimes a decade, to plan and permit and pour. The technology sector sprints. The energy sector walks. And the gap between those two speeds is where the next phase of the AI story is being decided.
The numbers have stopped being abstract. Globally, electricity demand from data centers is on track to roughly double by the end of the decade, and the portion driven specifically by AI is set to climb several times faster than that. In the United States, the concentration is even more striking: data centers accounted for around half of all growth in the country’s electricity demand last year. Industry estimates suggest U.S. data center power needs could nearly double again in just three years, adding the equivalent of a midsized country’s entire electricity appetite to the grid in that window. In certain regions this is no longer a forecast. In Virginia’s “Data Center Alley,” these facilities already consume something close to a third of the entire state’s electricity.
And this is no longer a story about a few famous places. Drive thirty miles south of Milwaukee, to the Wisconsin village of Mount Pleasant, and you will find the clearest version of it. On land that was cleared years ago for a Foxconn factory that mostly never arrived, Microsoft is now building data centers instead. Early in 2026 the village board approved fifteen more of them across two new campuses, a development valued at more than thirteen billion dollars, on top of the seven billion the company is already spending nearby. Once fully built, those campuses are expected to draw around two gigawatts of power, roughly what a major city consumes, and up to eight million gallons of water a year. One analysis warned that the broader wave of data center demand could add well over a hundred billion dollars to Wisconsin’s electricity system costs in the decades ahead. The same conversation is now unfolding in Port Washington, Beaver Dam, Janesville, and a string of other towns. The map of the AI buildout is no longer coastal and abstract. It runs through farmland and former factory sites in the middle of the country, and the people who live nearby are the first to feel the weight of it.
That demand does not arrive politely. Data centers draw enormous, concentrated, around-the-clock loads, and they do not tolerate outages the way a factory or a neighborhood can. They scale up in one spot, all at once, which is precisely the hardest kind of demand for a grid to absorb. So the industry has started reaching for whatever can keep the lights on: long-term power contracts signed directly with producers, onsite gas, large banks of batteries, and a sudden, serious revival of nuclear interest, including the small modular reactors that were dismissed as too speculative not long ago. The line between technology companies and energy companies is blurring, because at this scale they are increasingly the same business.
And ordinary people have started to notice. The buildout that was supposed to be invisible is showing up on electricity bills and in local politics. Communities are organizing against new campuses they see as extractive, drawing water and power and tax breaks while leaving little behind. Public sentiment toward AI has cooled noticeably, and the data center has become the physical object onto which a more diffuse anxiety gets projected. The abstraction has a body now, and bodies can be protested.
Compute becomes a question of sovereignty
Once a resource becomes both scarce and essential, it stops being purely commercial. It becomes political. This is the moment compute is living through.
For most of the cloud era, governments operated on a comfortable assumption: that computation could run anywhere, scale endlessly, and remain largely indifferent to borders. That assumption has collapsed. Nations have concluded that depending on foreign-owned infrastructure to run their most sensitive workloads, their public services, their defense systems, their economic models, is a strategic vulnerability they are not willing to accept. So they have started building.
France has committed on the order of a hundred billion euros to domestic AI infrastructure under a national plan, with goals to deploy more than a million accelerators and to lean on its nuclear fleet to power them. Its president frames this openly as a fight for sovereignty, a refusal to choose between dependence on American technology and Chinese state control. Canada has stood up a sovereign compute program to fund a nationally controlled supercomputer for its researchers and firms. The Gulf states have moved fastest and largest, striking infrastructure deals measured in the tens of billions of dollars, including arrangements that let one country host compute capacity on behalf of another, a concept that starts to resemble a digital embassy. At Davos and at the major AI summits this year, the central question was no longer how to regulate the technology in the abstract. It was who can build domestic capacity, and how fast.
It is tempting to reach for the obvious comparison and call compute the new oil. The comparison is useful but incomplete. Oil is extracted and burned. Compute is closer to a refined product that sits at the end of an unforgiving supply chain: raw silicon, advanced fabrication concentrated in a handful of facilities on Earth, specialized memory, packaging, and then the energy to run it all. You cannot simply drill for it. A country can have ambition, capital, and even raw power and still find itself standing in the same line as everyone else, waiting on a factory halfway around the world. That dependency, more than any single chip, is what keeps strategists awake. Whoever can access, allocate, finance, and scale compute most effectively will shape the next decade, and everyone knows it.
The economic divide nobody voted for
Strip away the geopolitics and there is a simpler, more personal story underneath. Compute scarcity quietly redraws who gets to build the future.
For startups, the change is stark. A decade ago, a credit card and a cloud account were enough to compete with anyone. The barrier to entry was talent and taste, not capital. That is no longer reliably true at the frontier. When the most capable hardware is rationed and goes first to those who can commit to enormous, multiyear contracts, the advantage tilts back toward incumbents and the very well funded. Some companies lock in capacity years in advance. Others refresh dashboards and adjust their entire roadmap around whatever happens to be available this quarter. The garage-to-giant story that defined the internet does not disappear, but it gets harder, and it bends toward those who already have scale.
For ordinary people, the effect is more diffuse and arrives later, but it is real. If running AI is expensive, then access to the best of it will be priced and tiered, the way every scarce resource eventually is. There is a version of the next decade where the most capable systems become a premium good, available in full to wealthy individuals, large companies, and rich countries, and available only in throttled, secondhand form to everyone else. We have a name for that pattern from the early internet. We called it the digital divide, and it was mostly about who had a connection. The new version is about who has compute, and it cuts deeper, because this time the resource is not just access to information. It is access to capability itself.
For labor, the pressure works from both directions. The same systems that strain the grid are the ones reshaping what work looks like, and the speed at which that reshaping happens is now partly a function of how much compute exists to run it. Scarcity is, oddly, a brake on disruption. It may turn out that the limiting factor on how fast AI changes the job market is not what the models can do, but how many of them the world can afford to run at once.
Wall Street is already pricing it
When a resource becomes this scarce and this essential, finance does not wait politely for it to be solved. It starts to price it. And the clearest sign that compute has crossed from ordinary input to genuine commodity came this spring from the person whose job is to notice exactly that.
Larry Fink, who runs the largest asset manager in the world, stood on a stage at the Milken Institute conference and predicted that an entirely new asset class is about to form. “A new asset class will be buying futures of compute,” he said, arguing that the country is short on compute, chips, memory, and power all at once. He placed computation in the same category as oil and grain, commodities that markets already hedge through forward contracts, and suggested companies will soon want to lock in future access to processing capacity the way an airline locks in the price of fuel.
This is the oil comparison from earlier returning in a sharper form. When the head of an eleven-trillion-dollar firm starts describing computation the way the market describes crude, the framing carries weight, and capital reorganizes around it. Investment flows toward the chips, the power, the cooling, and the land, the physical layer beneath the software, on the theory that owning the supply of a scarce resource may matter more than owning any one thing built on top of it.
There is a catch, though, and it is revealing. A futures market needs a unit. A barrel of oil is always a barrel of oil. A unit of compute is not so stable. It shifts with every hardware generation and every change in how AI workloads behave, and no one has yet agreed on what the standard measure should even be. Before compute can trade like a commodity, the world has to settle on how to count it, meter it, and verify it. That turns out to be less a question for traders than a question about the plumbing of the internet itself.
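A small sketch makes the unit problem concrete. The two hardware generations, the workload names, and the throughput figures below are all hypothetical; no real chip is being described. The point is only that the "exchange rate" between an accelerator-hour on one generation and an hour on another depends on the workload, so no single conversion factor exists.

```python
# Hypothetical relative throughput, in arbitrary work units per accelerator-hour.
THROUGHPUT = {
    "gen_a": {"training": 1.0, "inference": 1.0},
    "gen_b": {"training": 2.0, "inference": 3.5},
}


def work_delivered(generation: str, workload: str, accelerator_hours: float) -> float:
    """Work units actually produced by a block of accelerator-hours."""
    return THROUGHPUT[generation][workload] * accelerator_hours


def exchange_rate(workload: str) -> float:
    """How many gen_a hours one gen_b hour is worth for a given workload."""
    return THROUGHPUT["gen_b"][workload] / THROUGHPUT["gen_a"][workload]


# A contract for "100 accelerator-hours" delivers different amounts of work
# depending on what hardware fills it and what runs on it.
for generation in THROUGHPUT:
    for workload in ("training", "inference"):
        units = work_delivered(generation, workload, 100)
        print(f"100 h on {generation}, {workload}: {units:.0f} work units")

print("gen_b hour in gen_a hours (training): ", exchange_rate("training"))
print("gen_b hour in gen_a hours (inference):", exchange_rate("inference"))
```

A barrel converts to a barrel at one-to-one. Here the conversion is 2.0 for one workload and 3.5 for another, so a standardized contract would have to fix the hardware, the workload, or some agreed benchmark before anyone could price it.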
The coordination layer the internet is missing
There is a deeper problem hiding beneath the shortage, and it is not a hardware problem. It is a coordination problem.
We are building a world in which compute is metered, allocated, traded, and contested, in which software agents will need to pay for the resources they consume and prove what they did with them, in which capacity may sit idle in one place while demand goes unmet in another. The internet we have was never designed for this. It was designed to move information, not to coordinate a global market in computation with its own prices, its own settlement, and its own rules. We have been bolting that coordination on top, awkwardly, through contracts and dashboards and trust in a few large intermediaries.
This is where a set of ideas that grew up in a different corner of technology becomes relevant, almost in spite of itself. The more useful threads in crypto were never really about speculation. They were about programmable coordination: the ability to define economic rules in software, to settle value between parties who do not know each other, and to let a system rather than a company hold the ledger. Strip away the noise and what remains is a question that the compute era is now asking in earnest. How do you coordinate a resource, transparently and at scale, when no single actor should be trusted to control all of it?
Most blockchains only settle transactions. A smaller line of work has tried to do something more ambitious: to run actual computation on a decentralized network, so that applications, and eventually autonomous agents, can execute and pay and persist without sitting inside any one company’s data center. The Internet Computer is one of the clearer attempts at this, treating computation itself as something the network performs rather than merely records. Whether that specific approach wins is not the point. The point is the shape of the need. As compute becomes the scarce thing that everything else depends on, the systems we build to allocate it will need an economic coordination layer that is open, programmable, and not owned by a single gatekeeper. The infrastructure scramble is the visible event. The quieter one is the search for who, or what, gets to coordinate it.
A few ways this could go
I am wary of confident predictions, so think of these as live possibilities rather than forecasts.
The shortage could ease faster than expected. Manufacturing capacity could come online, efficiency could keep compounding, smaller and cheaper models could do more of the everyday work, and the panic of the early years could look, in hindsight, like the overbuilt fiber of the late 1990s: a glut that quietly became the foundation everything else was built on.
Or it could harden into structure. Compute could remain expensive and rationed for years, access could calcify into a durable divide between the compute-rich and everyone else, and the geography of intelligence could concentrate in a handful of countries and companies that got there first. In that world, the most important inequality of the century is not income or even information. It is capability.
The most likely outcome is messier than either, a long stretch in which abundance and scarcity coexist. Compute gets cheaper per unit and more contested in total. New power comes online and new demand swallows it. Sovereign capacity rises alongside a few dominant global providers. The divide widens in some places and shared infrastructure narrows it in others. There is no clean resolution, just a permanent negotiation over the most strategically important resource we have ever built our economy on top of.
What the hum is telling us
If you stand near one of these campuses, the thing you notice is the sound. A low, constant hum, the noise of cooling and power and computation that never sleeps. It is the sound of the abstraction becoming concrete, of “the cloud” turning back into machinery and land and electricity, into something with weight and consequence and a place on the map.
For a long time we measured technological progress by what software could imagine. We are entering a period where it will be measured, just as much, by what infrastructure can support. The ideas are running ahead of the ground beneath them, and the ground is where the next decade will actually be decided. Not only in the breakthroughs, but in the substations and the supply chains and the slow, physical work of building enough of the world to run the future we already designed.
The remarkable part is how little of this is visible from the surface of ordinary life. You open an app, you ask a question, you get an answer in a second, and nothing about that moment reveals the scramble underneath it. But it is there, in the queue of ambitions waiting on a factory, in the power line that will not be built in time, in the quiet decision about who gets allocation and who waits.
The cloud got heavy. We are only beginning to reckon with what it will take to hold it up.
Everything Is Downstream of Compute was originally published in Coinmonks on Medium.
