Anthropic launches Claude Opus 4 and presents it as the best programming model in the world

After Google rolled out its full arsenal in artificial intelligence, Anthropic did not want to be left behind. The company co-founded by Dario Amodei has made a strong move: it has presented Claude Opus 4 and Claude Sonnet 4, two new models with which it aims to leave its mark on the AI race.

The star of the announcement is Claude Opus 4, the most advanced model Anthropic has developed so far. And the company does not beat around the bush: it claims this is “the best programming model in the world.” It is an ambitious statement that, as always, will have to be tested. But the first data places it in a very strong position against its main rivals.

Benchmarks

On the SWE-bench Verified benchmark, which evaluates real software engineering tasks, Opus 4 scores 72.5% under standard conditions and reaches 79.4% when parallel processing is used. That performance puts it above models such as GPT-4.1 (54.6%), o3 (69.1%) and Google's recent Gemini 2.5 Pro (63.2%).

However, in other, more demanding tests of reasoning and multimodal understanding, such as GPQA Diamond or MMMU, which focus on university-level questions and complex scenarios that combine text and images, Opus 4 fails to beat o3, which continues to lead in that field.

A model with endurance and autonomy

But beyond the numbers, what Anthropic wants to highlight is the endurance and autonomy of this model. Claude Opus 4 is capable of sustaining long work sessions and executing thousands of steps continuously. The company explains that this makes it an ideal foundation for more sophisticated AI agents: systems that make decisions, complete tasks on their own and do not need constant human supervision.

Arriving alongside it is Claude Sonnet 4, an evolution of the model that Anthropic launched in February. It is not intended to compete with Opus on power, but it offers a well-balanced trade-off between performance and efficiency. In coding it also makes a significant leap over its previous version: it goes from 62.3% to 72.7% on SWE-bench Verified, and it improves in reasoning tasks, instruction following and general accuracy.

Both models arrive with notable new features. For example, they can now alternate between reasoning and tool use within the same process, which allows for more complete answers. They have also improved in reliability. According to Anthropic, they are 65% less likely to take shortcuts or make serious mistakes than Sonnet 3.7.

Claude Opus 4 and Sonnet 4 are already available through the Anthropic API, Amazon Bedrock and Google Cloud Vertex AI. They are included in the Pro, Max, Team and Enterprise plans. Prices stay in line with the previous models: Opus 4 costs $15 per million input tokens and $75 per million output tokens. Sonnet 4 is more affordable: $3 and $15 respectively. The latter can also be used from free accounts.
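To make the price gap concrete, here is a minimal Python sketch that applies the per-million-token prices quoted above to a workload. The model labels (`opus-4`, `sonnet-4`), the function name and the example token counts are illustrative assumptions, not official API identifiers or real usage figures.

```python
# Prices in US dollars per million tokens, as quoted in the article.
# The dictionary keys are hypothetical labels, not official model IDs.
PRICES_PER_MILLION = {
    "opus-4": {"input": 15.0, "output": 75.0},
    "sonnet-4": {"input": 3.0, "output": 15.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a given number of tokens."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, 200_000, 50_000)
        print(f"{model}: ${cost:.2f}")
```

For that assumed workload the sketch gives $6.75 with Opus 4 and $1.35 with Sonnet 4, a fivefold difference that follows directly from the list prices.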

Images | Anthropic

