
I had lunch with the CEO of an AI infrastructure company recently. I can’t tell you their name, but they said something that really caught my attention: There will be a crop of new AI models later this year that will be a lot better and more efficient.
This will likely make AI tokens more abundant and radically cheaper. (Tokens are the basic units models use to process information, and the standard way AI use is measured and priced.)
Hand-wringing about tokenmaxxing could die down. Or, users could go on another bender and burn even more tokens with abandon.
Either way, the price of tokens is probably about to plummet. This is why we already see some AI model providers slashing prices, and other players talking about doing so.
OpenAI CEO Sam Altman recently said AI costs had become a huge issue, adding that the startup will have “a lot of ways we can help people get more value for less spend.”
This trend may already be showing up in the data. A closely watched token spending index run by Silicon Data peaked at around 2.06 in late May and fell to 1.75 as of June 10.
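To put that move in perspective, here is a minimal Python sketch of the percentage change between the two index readings quoted above. The helper name `percent_change` is my own, and the index values are simply the rounded figures from this article, so treat the result as approximate.

```python
def percent_change(old: float, new: float) -> float:
    """Return the percentage change from old to new (negative means a decline)."""
    return (new - old) / old * 100.0


# Index readings as quoted in the article (rounded values).
peak = 2.06    # late May
latest = 1.75  # as of June 10

change = percent_change(peak, latest)
print(f"Index change from peak: {change:.1f}%")
```

On these rounded figures, the index is down roughly 15% from its late-May peak.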
Carmen Li, the CEO of Silicon Data, told me this could mean token prices are dropping across many AI models.
Blackwell finally emerges
The main force driving token prices lower is a new wave of technology that’s sweeping through AI data centers.
Nvidia’s Blackwell GPUs are being installed in huge volumes right now. By the second half of this year, these systems, which are really supercomputers rather than chips, will be operating at scale, helping AI labs train new models and run them more efficiently.
These systems took a while to install properly, partly because they needed to be water-cooled and required other gnarly new data center setups. But the payoff could be huge.
50x more, 35x cheaper
SemiAnalysis, a respected AI research firm, compared Nvidia’s top Blackwell system, the GB300 NVL72, to Nvidia’s previous-generation Hopper system, the HGX H200.
With the older system, each GPU generated 90 tokens per second. With the new Blackwell system, that figure was 6,000. That’s more than 65 times as many.
These systems consume massive amounts of electricity, and the newer Blackwell offerings use even more. So SemiAnalysis also looked at how many tokens each system generated per megawatt. On this measure, Hopper churned out 54,000 tokens per second, while Blackwell generated 2.8 million. That’s about 50 times more.
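The arithmetic behind these multiples is easy to check. The short Python sketch below uses only the throughput figures quoted above; the variable names are my own. It also backs out the power draw per GPU that those figures imply, which is a rough derived number and assumes the per-GPU and per-megawatt results came from comparable test runs.

```python
# Throughput figures as quoted in the article (tokens per second).
hopper_tps_per_gpu = 90
blackwell_tps_per_gpu = 6_000
hopper_tps_per_mw = 54_000
blackwell_tps_per_mw = 2_800_000

# How many times more tokens Blackwell produces on each measure.
per_gpu_gain = blackwell_tps_per_gpu / hopper_tps_per_gpu
per_mw_gain = blackwell_tps_per_mw / hopper_tps_per_mw

# Implied power per GPU in kilowatts: (tokens/s per GPU) / (tokens/s per MW)
# gives megawatts per GPU; multiply by 1,000 for kilowatts.
# Assumption: both figures describe comparable workloads.
hopper_kw_per_gpu = hopper_tps_per_gpu / hopper_tps_per_mw * 1_000
blackwell_kw_per_gpu = blackwell_tps_per_gpu / blackwell_tps_per_mw * 1_000

print(f"Per-GPU gain:      {per_gpu_gain:.1f}x")
print(f"Per-megawatt gain: {per_mw_gain:.1f}x")
print(f"Implied power per GPU: Hopper {hopper_kw_per_gpu:.2f} kW, "
      f"Blackwell {blackwell_kw_per_gpu:.2f} kW")
```

The per-GPU figures work out to about 67 times and the per-megawatt figures to about 52 times, close to the rounded multiples cited here. The implied power per GPU is higher for Blackwell, which squares with the point that the newer systems draw more electricity.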
Electricity prices are rising, due to all these power-hungry AI data centers. So these days, GPU systems are assessed based on how much it costs to generate one million tokens.
SemiAnalysis tested this, too, and found that the older Hopper system cost $4.20 for every million tokens. The Blackwell system cost 12 cents. That’s 35 times cheaper.
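Here is a rough Python sketch of what that gap means in practice. The two cost figures come from the paragraph above; the one-billion-token workload and the helper name `cost_for_tokens` are hypothetical, chosen only for illustration. Note that these are the costs of generating tokens on each system, not the prices AI providers charge customers.

```python
import math

# Cost to generate one million tokens, as quoted in the article (US dollars).
HOPPER_COST_PER_MILLION = 4.20
BLACKWELL_COST_PER_MILLION = 0.12


def cost_for_tokens(num_tokens: float, cost_per_million: float) -> float:
    """Return the generation cost in dollars for a given number of tokens."""
    return num_tokens / 1_000_000 * cost_per_million


# How many times cheaper Blackwell is per token.
cost_ratio = HOPPER_COST_PER_MILLION / BLACKWELL_COST_PER_MILLION

# Hypothetical workload for illustration: one billion tokens.
workload_tokens = 1_000_000_000
hopper_bill = cost_for_tokens(workload_tokens, HOPPER_COST_PER_MILLION)
blackwell_bill = cost_for_tokens(workload_tokens, BLACKWELL_COST_PER_MILLION)

print(f"Cost ratio: {cost_ratio:.0f}x cheaper")
print(f"1B tokens on Hopper:    ${hopper_bill:,.2f}")
print(f"1B tokens on Blackwell: ${blackwell_bill:,.2f}")
```

On these figures, a billion tokens would cost about $4,200 to generate on Hopper and about $120 on Blackwell.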
Again, new AI models will be increasingly trained and run on these new Blackwell systems as 2026 progresses. This is very likely to produce a massive increase in the number of cheaply generated tokens.
This is why AI model providers will probably slash token prices: Because they can.
The post Why AI token prices are about to plummet appeared first on Business Insider.




