OpenAI is now the bad guy of AI. GPT-5.4 will have to be very good to change that

The soap opera that has unfolded around the Department of Defense in recent days has made the public perception of two of the leading AI companies clear. Suddenly Anthropic is the good guy in the movie and OpenAI is the bad guy. And whether precisely for that reason or not, Sam Altman’s team has decided that now is the time to launch a new and promising AI model: GPT-5.4.

Hello, GPT-5.4. In its official announcement, OpenAI explains that this new model is currently available in two variants: GPT-5.4 Thinking and, for those who want “maximum performance in complex tasks”, GPT-5.4 Pro. We are looking at a foundational model that is better than ever in its reasoning, its programming ability and, above all, in one very fashionable thing: “agentic workflows”. In other words: doing things for us.

The “Use My Computer” mode takes center stage. That name is a loose translation, but it is more or less what OpenAI highlights as what is probably the great novelty of this model. As the announcement puts it, this is the company’s first model “with native computer use capabilities.” It can take control of our machine and do things for us autonomously, completing complex cycles of action and solving the problems that arise along the way. Not only that: according to its creators, GPT-5.4 “is our most token-efficient reasoning model, using significantly fewer tokens to solve problems than GPT-5.2.” In other words, having AI do things for us will be cheaper, and it will do them even better.

It uses the computer better than we do. The benchmarks certainly seem to point to fantastic performance in these tasks. In the OSWorld-Verified test, which measures a model’s ability to navigate a desktop environment using screenshots and virtual mouse and keyboard actions, GPT-5.4 achieves a 75% success rate. That is not only better than the 47.3% of GPT-5.2: it even exceeds human performance, which is 72.4% according to the creators of this benchmark. Other tests that evaluate an AI model’s ability to navigate also show that GPT-5.4 is clearly ahead of its predecessors.


The ARC-AGI result is scary. Machines were supposed to struggle with abstract reasoning problems that humans are naturally fantastic at, but so much for that. Recently we have seen how the ARC-AGI 2 test, which seemed like a real challenge for AI models, has become increasingly manageable for them. GPT-5.4 takes another bite out of that challenge: its Pro version already solves 83.3% of the tasks (73.3% for the standard model), whereas GPT-5.2 managed 52.9%. It is a simply brutal jump, and although the jump is less notable in other tasks (it programs somewhat better according to SWE-Bench Pro, but not by much), it is clear that we are looking at an extraordinary model.

Perfect for OpenClaw? That ability seems tailor-made for OpenClaw, the AI agent that has become a phenomenon in this area in recent weeks. OpenAI ended up hiring its creator and is in some way the “owner” of the project, and this performance in agentic tasks is expected to be very useful for everything OpenClaw does, which is basically that: managing your machine for you. That’s where GPT-5.4 can really come into its own.

And you can trust it more. According to OpenAI, GPT-5.4 is now better at answering questions that require seeking information from multiple sources, and “identifying the most relevant ones, particularly for ‘needle in a haystack’ type questions, and synthesizing them into a clear and well-reasoned answer.” What’s more, the company describes it as its model most focused on fact-based answers and says it is 33% less likely to state something false than GPT-5.2.


But be careful: it is very, very expensive. These capabilities will not come cheap. With this launch OpenAI has updated its prices, making it clear that if you want the best, you will have to pay for it. The “standard” GPT-5.4 model costs $2.50 per million input tokens and $15 per million output tokens, while the Pro costs a whopping $30 and $180 respectively. Claude Opus 4.6, which until now was considered the best AI model, costs $10 per million input tokens and $25 per million output tokens: it was already expensive, but GPT-5.4 Pro makes it look almost like a “bargain.”
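
To put those rates in perspective, here is a minimal sketch in Python that computes the cost of a single request at the prices quoted above. The workload (200,000 input tokens and 50,000 output tokens) and the `request_cost` helper are assumptions made purely for illustration; they are not figures or code from OpenAI.

```python
# Prices in USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.4 Pro": (30.00, 180.00),
    "Claude Opus 4.6": (10.00, 25.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 200_000, 50_000):.2f}")
```

Under that assumed workload the request would cost $1.25 on GPT-5.4, $15.00 on GPT-5.4 Pro and $3.25 on Claude Opus 4.6. Because the Pro rates are 12 times the standard ones for both input and output, that ratio holds for any mix of tokens. The comparison also leaves out the token-efficiency gains OpenAI claims, which would reduce the number of tokens billed.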

Trying to stop the bleeding. The model appears at a delicate moment. According to various sources, ChatGPT has lost 1.5 million users since OpenAI announced that it had reached an agreement with the Department of Defense. That decision provoked much criticism, a social media movement calling to “cancel ChatGPT” and internal tensions. Before the scandal there was already talk of the potential appearance of GPT-5.4, but it is clear that the launch now takes on a double meaning. It doesn’t just have to be better than everyone else: it has to redeem OpenAI.

And above all, it needs a win. Public perception seems clear: OpenAI has been suffering lately, whether from internal dramas, talent drains, or temporarily falling behind in the performance of its models. GPT-5.4 is not a simple evolution of its foundational model, because what OpenAI needs is for this model to succeed and convince people to “fall in love again” (figuratively, you know what we mean) with ChatGPT. We’ll see if it succeeds.

In Xataka | Sam Altman says he’s terrified of a world where AI companies believe themselves to be more powerful than the government. That is exactly what he is building
