The NVIDIA Tax
I live in Virginia, which means I have a front-row seat to the strangest tax increase in modern American life. Nobody voted for it. It isn't on any ballot. But it's showing up on the electric bills of people who have never typed a prompt into a chatbot and wouldn't know a GPU from a garden hose.
In January, Consumer Reports profiled a man in Manassas who had lived in the same house for nearly forty years and opened an electricity bill for $281 - roughly triple what he'd paid the month before. He is not a heavy user. He did not buy a data center. He simply happens to live near "Data Center Alley," the Northern Virginia corridor where, by Consumer Reports' accounting, electricity prices in the densest data-center counties have climbed 267 percent over five years. Nearly three-quarters of Virginia voters now blame those buildings for what's happening to their bills, and they aren't wrong to. Abigail Spanberger won the governorship partly on a promise to do something about it.
This is not just a Virginia story. Residential electricity prices nationwide are up more than 36 percent since 2020. Goldman Sachs told clients that power prices rose 6.9 percent in 2025 - more than double headline inflation - and that data centers will account for fully 40 percent of the growth in electricity demand through the end of the decade. The mechanism by which this lands on ordinary people is almost elegantly unfair: in at least 40 states, utilities are allowed to bill customers in advance for grid construction that hasn't been finished yet. So the retiree in Manassas isn't just paying for the power the data centers use. He is pre-financing the substations being built to feed them.
Here is the part nobody says out loud at the ribbon-cuttings. That $281 bill is, in part, a margin payment to a single chip company in Santa Clara.
Naming the tax
NVIDIA controls something like 81 percent of the data-center AI chip market. It did $193.7 billion in data-center revenue last fiscal year at a gross margin around 75 percent. On the individual flagship parts the math is even more startling - one widely cited teardown pegs the build cost of a top GPU near $3,300 against a sale price around $28,000. That's an 88 percent margin. It is one of the great pricing-power stories in the history of manufactured goods, and good for them. I mean that. NVIDIA earned its position over seventeen years while the rest of the industry laughed at the idea that graphics cards mattered.
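If you want to check that arithmetic yourself, here is a minimal sketch in Python. The two dollar figures are the teardown estimates quoted above, not audited numbers, and the function names are my own, chosen only for illustration.

```python
def gross_margin(price: float, cost: float) -> float:
    """Share of the sale price left over after the build cost."""
    return (price - cost) / price


def markup_multiple(price: float, cost: float) -> float:
    """How many times the build cost the profit per unit represents."""
    return (price - cost) / cost


# Teardown estimates quoted in the column (assumed, not audited).
BUILD_COST = 3_300
SALE_PRICE = 28_000

margin = gross_margin(SALE_PRICE, BUILD_COST)
markup = markup_multiple(SALE_PRICE, BUILD_COST)

print(f"Gross margin: {margin:.0%}")
print(f"Profit per unit is about {markup:.1f}x the build cost")
```

Margin is measured against the sale price, which is why it can never pass 100 percent. The same gap measured against the build cost looks far larger, so both functions are there to keep the two from being confused.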
But a margin like that is not a price. It's a tax. And a tax has to be paid by someone. The whole AI economy is, at bottom, an elaborate machine for distributing that bill - and the person in Manassas is at the end of the chain, paying NVIDIA's gross margin through his electric meter without ever seeing the invoice.
So let's go up the chain and look at everyone else who's paying it, because once you see the full guest list, you start to wonder whether the problem is really NVIDIA at all.
Even Lisa Su can't fix it
Start with the one person you'd expect to be able to fight back: Lisa Su.
I want to be careful here, because Lisa Su is one of the best chief executives in technology, full stop. She took a company that was nearly a punchline and made it a genuine force. When she stood on the CES stage in January and held up the "Venice" EPYC processor and the new Helios racks, she was not bluffing - that is real, excellent silicon.
And watch what she did with it. She called Venice "the best AI CPU" - and then defined "best AI CPU" entirely as the chip that feeds the GPUs fastest. A Helios rack carries roughly 4,600 of those world-class CPU cores, and their assigned job is to shuttle data to the GPUs and keep them fed. Hundreds of thousands of dollars of the finest general-purpose processors on earth, deployed as a butler to the accelerators.
I don't say this to mock AMD. I say it because it's the most important fact in the whole story. If Lisa Su - with the best CPUs in the industry sitting right there in the rack - cannot imagine those cores doing anything but waiting on the GPUs, then the assumption we're dealing with isn't a company's blind spot. It's the water the entire industry swims in. The CPU is right there. Nobody can see it.
Everybody's building a cheaper GPU
The hyperscalers are the biggest taxpayers of all, and the best proof that this is a tax rather than a price - because they are spending unfathomable sums to stop paying it. Combined hyperscaler capital spending will run somewhere around $660 to $690 billion this year. A large and growing slice of that is going into custom silicon - Google's TPUs, Amazon's Trainium, Microsoft's Maia, Meta's MTIA - chips they designed themselves specifically to get out from under NVIDIA's margin, and which analysts estimate carry a 40 to 65 percent total-cost-of-ownership advantage. When Midjourney moved its inference off GPUs onto Google's TPUs, its monthly compute bill reportedly fell from $2.1 million to $700,000.
Companies do not design their own processors to save loose change. They do it the way you'd dig an escape tunnel. And the AI labs are digging too - Anthropic now trains on hundreds of thousands of Amazon's Trainium chips and is one of the largest TPU customers in the world. OpenAI is designing its own ASICs with Broadcom.
But here's what strikes me about all of it: every single escape tunnel leads to the same place. A cheaper GPU. A custom accelerator. A better, faster, more efficient thing to run the model on. Nobody in this picture - not AMD, not Google, not OpenAI - is asking the prior question, which is whether the workload needed an accelerator in the first place. They are all answering "how do we pay less tax" and none of them is asking "why am I being taxed for this transaction at all."
The money goes in a circle
Before I get to that question, look at the last group of payers, because they're the ones the bull market would rather you didn't think about.
The financing underneath this boom has gone circular in a way that should make anyone who lived through 2001 sit up. NVIDIA committed up to $100 billion to OpenAI; OpenAI turns around and spends to fill data centers with NVIDIA chips. Oracle signed a cloud deal with OpenAI reported at $300 billion. AMD struck arrangements worth a reported $200 billion that included handing a customer equity warrants. All told, analysts have tallied more than $800 billion in these loops, in which the chip vendors and cloud providers are simultaneously the investors in, and the suppliers to, the customers buying their products - while OpenAI is projected to lose around $14 billion this year.
We have seen this movie. In the late-1990s fiber boom, the equipment makers used vendor financing to let the carriers keep buying gear; when real demand came in under the forecast, the model snapped, and the dark fiber sat unused for years. The difference between a flywheel and a house of cards is whether real end-user demand is actually showing up, or whether the same dollars are just going around the table getting counted as revenue each lap. Nobody knows which one this is yet. But the systemic risk is plain: when everyone is each other's investor, supplier, and customer, one stumble can cascade through the whole ring. If you own an index fund, you own a seat at that table. You're paying the tax too - you just don't get a bill.
The wrong layer
So who pays the NVIDIA tax? Hyperscalers tunneling out through custom chips. Labs doing the same. The neoclouds reselling the markup. A retiree in Manassas. Anyone with a 401(k). It turns out the answer is "almost everyone," which is usually the sign that you're looking at something structural rather than something that one better product will fix.
And that's the thing I keep coming back to. Every proposed cure is a better, cheaper, more efficient version of the same machine. A faster horse. The entire industry has decided the second source for AI compute is a different accelerator, when the real second source might be a different architecture entirely.
Because here is the uncomfortable arithmetic: roughly two-thirds of AI compute today is inference, not training. And a very large share of inference is not creative generation at all - it's retrieval. Looking something up. Checking a fact against a record. Pulling the right paragraph out of a known document. Those are jobs a CPU has done beautifully and cheaply for decades, at a few watts, the kind of work those 4,600 idle EPYC cores could do in their sleep if anyone asked them to. We route it to a 700-watt GPU anyway, because the industry decided years ago that AI means GPU, and nobody has stopped to recheck the premise since.
That's the tax. Not the chip. The assumption. The retiree in Manassas isn't paying a surcharge on silicon. He's paying a surcharge on a question nobody is asking.
Disclosure
I should tell you that I am not a neutral observer. I co-founded a small company called 2Brains that is built on exactly the argument I just made - that a large fraction of enterprise AI queries never needed a GPU, and ought to run on a CPU at a fraction of the watts, with the accelerator reserved for the work that genuinely requires it. We think it's most of the traffic; that's our bet, and I'll let the benchmarks rather than my adjectives make the case over time.
I'm disclosing that not to pitch you - you can ignore my company entirely and the column stands - but because I want you to know where the idea comes from, and because the conflict cuts the other way too: I've spent enough time staring at this problem to be quite sure the question is wrong, and being sure is exactly when a writer owes you the disclosure.
The tunnel everyone's digging leads to a cheaper GPU. The exit nobody's looking for is the door marked "did this query need one at all." Until somebody opens it, the bill keeps landing in Manassas.
The post The NVIDIA Tax first appeared on I, Cringely.