On May 20, Google launched Gemini 2.5 Pro and Gemini 2.5 Flash in preview. These new AI models were better than ever, and to prove it the company included several charts and comparative tables in its announcement.
They showed how both surpassed their rivals in reasoning as well as in traditional performance (math and programming benchmarks), but there was another figure Google was keen to highlight: the cost of Gemini 2.5 Flash.


Source: Google.
The table Google published made it clear that Gemini 2.5 Flash was the clear winner of that comparison in the all-important price/performance ratio. What Google did not say is that this model's success was the exception to the rule, because in the race to offer cheap and powerful models, China seems to be taking the lead.
At least it is if we look at the cost of using these models. At Xataka we have analyzed that cost based not on the price of subscriptions for end users, but on the cost of access to the API, which is what allows developers to integrate these models into their own chatbots and services.
Each model's API prices clearly distinguish two uses of the artificial intelligence. On one side, how much it costs to send the model something to process (the so-called input tokens). On the other, how much the text the model generates once it has processed the request costs (the so-called output tokens).
Input tokens are usually around five times cheaper than output tokens, because generating the response is much more expensive than receiving, analyzing and "understanding" the request. We wanted to compare the cost of the main AI models developed in China and in the US, and although, as always, not every model could be included, all the ones included matter. The resulting table is as follows:


These prices are public and very easy to find for the US AI models (OpenAI, Anthropic, Google), but not so much for the Chinese ones (DeepSeek, Qwen (Alibaba), Doubao (ByteDance), GLM-4 (Zhipu), Ernie (Baidu)).
Be that as it may, the table, ordered from cheapest to most expensive, shows that Chinese models are especially cheap today. Only Gemini 2.5 Flash Preview manages to compete, and it does so exceptionally well. In every other case, China's AI models win the battle on cost.
It must be noted that, like all comparisons, this one is unfair, because the table does not take into account the capabilities of each model. OpenAI o3 and Anthropic's Claude Opus 4, the latest and most powerful models from those companies, are especially precise in their answers, but each query consumes far more resources (compute, energy), which makes it logical that they are much more expensive than their competitors.
But those models are also designed for very particular cases and for specialized, detailed and deep queries. In the vast majority of cases there is no need to use them, and that is where models such as DeepSeek R1 or Gemini 2.5 Flash Preview compete: in the price/performance ratio.
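To see how these per-token rates translate into what a single API call actually costs, here is a minimal sketch. The prices used are hypothetical placeholders, not any vendor's actual rates; they only illustrate the output-costs-more-than-input pattern described above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost of one API call in dollars.

    Vendors publish prices per million tokens, so we scale accordingly.
    """
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Hypothetical rates: $0.50 per million input tokens and
# $2.50 per million output tokens (output ~5x the input price).
cost = request_cost(input_tokens=2_000, output_tokens=500,
                    input_price_per_m=0.50, output_price_per_m=2.50)
print(f"${cost:.6f}")  # $0.002250
```

Note that even though output tokens are five times more expensive here, a typical request sends far more input than it gets back, so both sides of the bill matter.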
Models with variable prices
That price battle has exposed us in recent times to two techniques that some companies are applying to the usage prices of their APIs. The first is to differentiate normal inputs and outputs from cached inputs (and even cached outputs).


DeepSeek API prices. Note the bottom left: depending on the time at which you use them, they can turn out cheaper. Source: DeepSeek.
The explanation is simple: a "normal" input is a request or question that the model has never processed and therefore has to process in full. If the input is cached (a cache hit), that request has been processed in the past, so the system can retrieve the response from its cache, which significantly reduces computational costs. DeepSeek, Google, Anthropic and OpenAI all offer this kind of option, as can be seen in the table.
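The effect of cached input on the bill can be sketched as follows. Again the rates are made-up placeholders; the point is only that tokens served from cache are billed at a much lower price than a cache miss:

```python
def input_cost(total_tokens: int, cached_tokens: int,
               price_per_m: float, cached_price_per_m: float) -> float:
    """Bill cache misses at full price and cache hits at the reduced rate."""
    missed = total_tokens - cached_tokens
    return (missed / 1_000_000) * price_per_m \
         + (cached_tokens / 1_000_000) * cached_price_per_m

# Hypothetical rates: $0.50/M for a cache miss, $0.05/M for a cache hit.
no_cache = input_cost(100_000, 0, 0.50, 0.05)            # 0.05 dollars
mostly_cached = input_cost(100_000, 90_000, 0.50, 0.05)  # 0.0095 dollars
print(no_cache, mostly_cached)
```

With 90% of the input served from cache, the input side of this hypothetical bill shrinks by roughly 80%, which is why repeated prompts (long system instructions, shared document context) benefit so much from this pricing.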
The second technique is to use variable prices according to (for now) the time slot in which we use these models. This is what DeepSeek has done, with "day" and "night" prices based on UTC. If you use the DeepSeek API from 18:30 to 2:30 (peninsular Spain time), it is half price.
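Picking the right rate for a timestamp is mildly tricky because the discount window wraps past midnight. Here is a sketch of that logic; the window (16:30 to 00:30 UTC, matching the article's 18:30-02:30 peninsular Spain example under UTC+2) and the 50% discount are illustrative values, not an authoritative reading of any provider's terms:

```python
from datetime import datetime, time, timezone

# Illustrative off-peak window in UTC; a real provider publishes its own.
OFF_PEAK_START = time(16, 30)
OFF_PEAK_END = time(0, 30)

def is_off_peak(now_utc: datetime) -> bool:
    """True if the UTC timestamp falls in the discounted night window.

    The window wraps past midnight, so a single range check won't do:
    we accept times after the start OR before the end.
    """
    t = now_utc.time()
    return t >= OFF_PEAK_START or t < OFF_PEAK_END

def price_per_million(now_utc: datetime, day_price: float,
                      night_discount: float = 0.5) -> float:
    """Pick the applicable per-million-token rate for a timestamp."""
    return day_price * night_discount if is_off_peak(now_utc) else day_price

evening = datetime(2025, 5, 20, 20, 0, tzinfo=timezone.utc)
midday = datetime(2025, 5, 20, 12, 0, tzinfo=timezone.utc)
print(price_per_million(evening, 1.0))  # 0.5
print(price_per_million(midday, 1.0))   # 1.0
```

A batch job that can wait for the night window therefore pays half as much for exactly the same work, which is the whole appeal of this scheme.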
Good news: AI keeps getting (much) cheaper
While China and the US fight over who has the most powerful model or who has the cheapest one, what keeps happening is that the price of AI is falling remarkably.
It is an observation shared by several experts, such as Ethan Mollick, a professor at the University of Pennsylvania, who recently analyzed how that price/performance ratio does nothing but improve. The models keep getting better and cheaper.
Other experts such as Raveesh Bhalla, formerly of Netflix and LinkedIn, also highlighted this trend at the start of the year. Back then he showed how the cost of an o1-level model had fallen 27-fold in three months. Moreover, at that rate, the cost of GPT-4-level models, which a year earlier were the absolute reference, would fall 1,000-fold in just 18 months.
We are seeing that price reduction firsthand. Dane Bahey, from OpenAI, explained at a conference last September how the cost per million tokens had fallen from 36 dollars to just 0.25 dollars over the previous 18 months. And that price drop remains clear, and fantastic for users.
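As a back-of-the-envelope check of the figures Bahey cites, we can compute the overall drop factor and the compound monthly decline it implies:

```python
# $36 down to $0.25 per million tokens over 18 months, per Bahey's talk.
start_price, end_price, months = 36.0, 0.25, 18

drop_factor = start_price / end_price                 # overall price factor
# Compound monthly rate of decline implied by those endpoints.
monthly_decline = 1 - (end_price / start_price) ** (1 / months)

print(f"{drop_factor:.0f}x cheaper overall")          # 144x cheaper overall
print(f"~{monthly_decline:.0%} cheaper each month")   # ~24% cheaper each month
```

In other words, those two data points imply prices shrinking by roughly a quarter every single month, which is the scale of deflation the article is describing.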
So we are looking at a race that still has a long way to run: China's models are in the lead if we look only at cost, but beware, because capabilities must also be taken into account. It is true that these Chinese models have been showing in benchmark after benchmark that they compete on equal footing with the best US models, and now it remains to be seen who will end up coming out on top.
For now, though, there is one absolute winner in that race: users, who get AI that is better and cheaper with every passing day.
Image | Joshua Hoehne | Alejandro Luengo