Google just changed the rules of the lightweight model

Now, in the race to lead the development of artificial intelligence, something unusual has just happened. Gemini 3 FlashGoogle’s new model, has surpassed GPT-5.2 Extra High, the higher-reasoning variant of OpenAI, in several performance tests. And that forces us to rethink some of the rules that we took for granted.

A fast model that also reasons. Google’s new model comes with a very specific promise: to demonstrate that “speed and scalability do not have to come at the expense of intelligence.” Although it has been designed with efficiency in mind, both in cost and speed, Google insists that Gemini 3 Flash also excels at reasoning tasks.

According to the company, the model can adjust your thinking ability. It is able to “think” for longer when the use case requires it, but it also uses 30% fewer tokens on average than Gemini 2.5 Promeasured with typical traffic, to complete a wide variety of tasks with high precision and without penalizing response times.

The truth is in the benchmarks. Are the benchmarks perfect? No. But they are still one of the most useful tools we have for comparing AI models.confront them against each other and detect in which scenarios they perform better or worse. And in this area, Gemini 3 Flash comes out well.

In SimpleQA Verifieda test that measures reliability in knowledge questions, Gemini 3 Flash achieves 68.7% compared to 38.0% for GPT-5.2 Extra High. In multimodal reasoning, within MMMU-Pro, Google’s model scores 81.2% compared to OpenAI’s 79.5%. In Video-MMMU, Flash achieves 86.9% compared to 85.9% for GPT-5.2 Extra High.

Gemini 3 Flash Final Benchmark Table Light 25 1 Original
Gemini 3 Flash Final Benchmark Table Light 25 1 Original

If we look at multilingual and cultural capabilities, Flash is again ahead, with 91.8% compared to 89.6% for GPT-5.2 Extra High. In Global PIQA, focused on common sense in 100 languages, the difference remains: 92.8% for Flash versus 91.2% for the OpenAI model. Everything indicates that Gemini 3 Flash is specially optimized to capture nuances outside of English and reason more fluently in global contexts.

He also excels in the use of tools and agents. In Toolathlon, Flash scores 49.4% compared to GPT-5.2 Extra High’s 46.3%. In the FACTS Benchmark Suite, the difference is tighter, but still in favor of Google: 61.9% versus 61.4%. In long-term tool execution tasks, Flash appears to show greater consistency.

But he is not the king of pure reasoning. Now, it is worth looking at the complete photo. Although Gemini 3 Flash outperforms the best OpenAI model in several tests, if you are looking for “pure” reasoning, the balance changes. In the most demanding tests in this area, GPT-5.2 Extra High continues to set the benchmark.

OpenAI’s model leads ARC-AGI-2, focused on visual puzzles, with 52.9% compared to Flash’s 33.6%. In AIME 2025, with code execution, it reaches 100% compared to 99.7%. And in SWE-bench Verified, aimed at software engineering, it obtains 80.0% compared to 78.0% for Gemini 3 Flash.

What exactly is GPT-5.2 Extra High. Throughout the article the name GPT-5.2 Extra High appears several times, and it is normal to wonder if it is something new or little known. In reality, it is not a model that is usually mentioned to the general public.

Google uses this designation in its comparison table to refer to the maximum level of reasoning available in the OpenAI API for GPT-5.2 Thinking and Pro. In the official OpenAI documentation it is identified as “xhigh”.

Where you can use Gemini 3 Flash. Access to Gemini 3 Flash is not country dependent. If you have access to the Gemini appyou are already using this model, which has become the default option. It is also reaching developers through the API, AI Studio and Vertex AI. In the United States, the deployment goes a step further, as the Gemini 3 Flash has become the default model of the AI Mode of the Google search engine.

The price of using Gemini 3 Flash. For those who want to integrate Gemini 3 Flash into their applications, the model costs $0.50 per million input tokens and $3 per million output tokens. This is a slight increase over Gemini Flash 2.5, which was $0.30 per million tokens in and $2.50 per million tokens out.

An increasingly tight race. Gone are the days when Google tried to confront ChatGPT with Bard, or when OpenAI seemed to be years ahead of the rest. Today, the distances between the big players in AI have been drastically reduced. The competition is more direct, more technical and, above all, much closer.

Images | Google

In Xataka | Amazon is preparing an investment of 10 billion in OpenAI because if you can’t beat your enemy, the best thing is to join him

Leave your vote

Leave a Comment

GIPHY App Key not set. Please check settings

Log In

Forgot password?

Forgot password?

Enter your account data and we will send you a link to reset your password.

Your password reset link appears to be invalid or expired.

Log in

Privacy Policy

Add to Collection

No Collections

Here you'll find all collections you've created before.