We already know what happens to the GPU hourly price when OpenAI or Anthropic launch a new model: it doubles

This week, an analyst named Tomasz Tunguz published in X two revealing graphs. They show the evolution of what it costs AI startups to access cloud computing, and there is bad news. The cost of renting the NVIDIA B200 GPUs with Blackwell architecture has gone from $2.31 per hour in early March to $4.95 per hour this week. It is an increase of 114% in just six weeks and it has a clear cause: the arrival of new models from Anthropic and OpenAI. What the graphs show clearly. Those charts focus on the price index of Ornna cloud computing trading marketplace. The first of them covers the price of renting the B200 chips from the end of 2025 until today, and there are vertical lines showing each release of the latest models from OpenAI and Anthropic. The correlation is almost perfect: GPT-5 Codex, Claude 4.5, GPT-5.3 Codex, Claude Opus 4.7 and GPT-5.5 coincide with a jump in price indices. Every time these companies announce a new version of their frontier models, demand skyrockets, and so does the cost. If you want the best, pay (much more). The second graph shows the price difference between renting the previous generation of chips, H200 with Hopper architecture, and the new B200. The historical average of that “spread” is $1.06, but now it stands at $2.09, practically double. That means buyers—startups and AI companies—are paying a record premium for the extra memory and superior computing power of Blackwell architecture chips. Accessing the latest of the latest was already expensive. Now it is even more so. This also makes the H200 in a second class option for the most demanding models of 2026. Action and reaction. There is overwhelming logic here. When OpenAI or Anthropic release a new model, there is an explosion in inference. Developers and companies want to test them as soon as possible and integrate these models into their products (or compete with them). To do this, they need computing quickly, and a simultaneous demand is caused that unbalances the available inventory in the market for renting AI chips by the hour. The problem is that the supply of B200 does not grow at the same rate. Some companies have wanted to anticipate, and we have the perfect example in Google. He has bought all the B200s he can, and that has made these GPUs around now the 500,000 dollars on the secondary market according to analyst Jack Minor. The irony of efficiency. The curious thing is that the more efficient these chips are – and the B200s are – the more companies want to rent them at the same time to take advantage of those efficiency advantages that should lead to cost savings. What actually happens is that the scarcity of these advanced chips cancels out any theoretical savings. Long term contracts. Startups and companies that think in the short term are especially harmed in this area, because they face price jumps that are increasingly difficult to assume. Companies that signed computer rental contracts at the price then can now operate at less than half the cost of their competitors. Thinking in the medium or long term seems reasonable, although once again those who win are the hyperscalers and those companies that have managed to get hold of many B200s. And who wins even more is of course NVIDIA, which cannot cope. Few alternatives. In other markets such as energy or metals there is usually room for maneuver, Tunguz points out, but the same is not happening at the moment in the AI ​​segment. In the oil market, for example, if the price rises 114% in six weeks, companies can buy futures, options or fixed-price supply contracts to protect their margins. In cloud computing rental, those options are much more limited. And the result is a much more volatile segment. This will go further. We are probably facing a peak in demand that will be followed by a correction: the new batch of B200 chips that arrive in the second half of 2026 are expected to cause a drop in current prices. However, that $4.95 is now the new floor, not a peak, because demand for AI computing will continue to grow faster than TSMC’s production capacity. In the absence of the supply of AI chips growing significantly – and there are certainly movements that are trying to achieve this, such as those of Google with its TPUsAmazon with its Trainium or Huawei with its Ascend—, the problem will still be there. In Xataka | Europe is taking its technological independence so seriously that it is aiming for the most ambitious goal: NVIDIA

chatbot is not working and Anthropic says it is investigating an issue

This afternoon may not be the best time to leave any task in the hands of Anthropic’s AI. Most of the services of the company led by Dario Amodei are giving global failures this Tuesday. Everything points to a general decline, with two clear exceptions: Claude for Government and Claude Console, the management platform aimed at developers and companies. The details are visible on the Anthropic status page. claude.aithe gateway to the chatbot both in its web version and in the desktop and mobile apps, is completely out of service. We have been able to verify it: when trying to use it, the macOS application displays a clear message, “You cannot connect to Claude,” and invites you to check the Internet connection. They are also registering API problemsthe path that allows professional clients, such as developers and companies, to integrate Anthropic services into their own applications. This is the case of those who use it to power customer service chatbots or to access models such as Sonnet 4.6 and Opus 4.7 in Perplexity. On this front, the drop is partial: everything points to higher error rates or intermittent failures for some users. Claude Code It is also not spared and is experiencing a partial decline. The impact can be significant: it is one of the most established agentic AI tools on the market. Its adoption in developer workflows is increasing, so any failure can have direct consequences on the productivity of many people. For now It is not clear what caused the incidentalthough we do know that Anthropic teams are working to resolve it. The company itself has been updating the situation on its status page, which allows us to reconstruct a brief chronology of the events. April 28, 2026, 17:41 UTC (19:41 Spanish peninsular time). Anthropic detects the problem and begins the investigation. April 28, 2026, 17:51 UTC (19:51 Spanish peninsular time). Confirms bugs in the API, claude.ai and the login system. April 28, 2026, 18:33 UTC (20:33 Spanish peninsular time). It indicates that it is continuing to work to resolve the incident. For now, we just have to wait for the incident to be resolved. Claude chatbot users can use alternatives as ChatGPT, Gemini or Grok. The problem is evident: when the Anthropic service does not work, access is lost to key elements such as conversation history, projects and other associated data. We will update this article as soon as there is news. Images | Screenshot In Xataka | Kimi Code is eight times cheaper than Claude Code and does 75% of your work. The question is whether it is enough

Google will invest up to $40 billion in Anthropic because the new normal for AI is investing in your enemy

May the rhythm not stop. Amazon announced an investment of 25,000 million in Anthropic a week ago, and four days later Google went even further. The Mountain View Company spoke on Friday of an investment of up to $40 billion in that same company. We insist: this is non-stop. The money doesn’t stop flowing. In less than a week, two of the largest “cloud providers” in the world have committed to investing up to $65 billion in a company that, attention, is a direct competitor in the AI ​​segment. None have done it out of generosity, and here there is a lot of covering one’s back and, of course, circular financing. This is the Google agreement. Google will invest $10 billion now considering that Anthropic’s valuation is between $350 billion and $380 billion. From there, it can invest another $30 billion linked to company performance milestones that have not been detailed. What Google gains. In exchange for that investment, Google Cloud will provide an additional 5 GW of computing capacity from 2027, expanding the agreement that Anthropic had already announced with Google and Broadcom to contract 3.5 GW of computing in the form of access to their TPUs. Google already invested 300 million dollars in Anthropic in 2023, but months later he put it on the table another 2,000 million more and in 2025 another 1,000. Anthropic is already worth a fortune. It is estimated that before this agreement its participation in Anthropic was around 14%, and with this new agreement that participation will evidently increase. Anthropic’s valuation has grown dramatically in recent months, and according to Bloomberg There are offers for a new investment round that would place its value at 800,000 million dollars, already at the level of the 850,000 million valuation that OpenAI is around. Its growth is overwhelming, and it is clear that today She is the pretty girl of the industry. No one could wait. The speed with which these announcements have occurred is motivated in part by the competitive fear between Amazon and Google. Anthropic uses Trainium chips from Amazon and TPUs from Google: it needs both and they both know it. Every dollar those companies put into Anthropic is a business case for Claude’s clients to use AWS or Google Cloud, so it makes sense that both want to solidify that “preferential relationship” with the company that is conquering the enterprise market. The circular financing model as a standard. This week’s agreements consolidate what many already consider as the new normal sector: hyperscalers invest in AI startups, and AI startups spend that money on the infrastructure of those hyperscalers. For example: Google Cloud grew 36% in revenue last year to $58.7 billion and Anthropic was most likely one of its heavy clients. The money Google invests in Anthropic comes back in the form of invoices, and the same goes for Amazon and Trainium. But the investment has another reason. These investment agreements not only seek to strengthen ties with the most promising AI startup of the moment, but also have a significant stake in its shareholders. That’s even more striking, because both OpenAI and Anthropic They hope to go public before the end of the year and if so, Google and Amazon will have “bought cheap” their stake in a startup that is expected to skyrocket exceptionally once it becomes a public company. Once again, this is a bet for the future. But there is also the other big reason: the majority of investors (be they funds or companies) do not want to be left behind in this race and are betting because everyone else is doing it too. It doesn’t matter that AI companies are losing money non-stop: the promise is that there will come a time (2029 or 2030) in which the trend will change. It is not certain that this will happen, of course, but OpenAI or Anthropic play with that card and use it to their advantage. We have the last example in Mythos, an Anthropic model that it’s so good (or so they say and some others) who prefer not to make it public. It’s once again selling expectations… and it works. In Xataka | DeepSeek has just released a model that competes with Opus 4.6. It costs seven times less and runs on Chinese chips

Anthropic has not raised the price of Claude. He has invented something better: token inflation

“Don’t worry, it costs the same.” That was Anthropic’s message to announce the launch of its new AI model, Claude Opus 4.7. In that statement they made it clear that “the price remains the same as Opus 4.6: $5 per million entry tokens and $25 per million exit tokens“There was, however, fine print, because the model is better but to achieve it it reasons more, and that means one thing: more tokens. And the more tokens you consume, the more the AI ​​bill goes up. Anthropic already warned. It should be noted that in that official announcement Anthropic did not hide the facts. In one of the paragraphs he clearly explained how Opus 4.7 “thinks more” and that has a direct impact on token consumption (we highlight the difference in bold): “Opus 4.7 is a direct update to Opus 4.6, but there are two changes worth keeping in mind as they affect the use of tokens. First, Opus 4.7 uses an updated tokenizer that improves the model’s processing of text. This means that the same input can generate more tokens (approximately 1.0 to 1.35 times moredepending on the type of content). Second, Opus 4.7 performs deeper analysis at higher effort levels, especially in the later phases of agent scenarios. “This improves its reliability on complex problems, but also means generating more output tokens.” Or what is the same: when it responds, Opus 4.7 uses significantly more tokens than its predecessor, and that is important because the output tokens are much more expensive than the input ones. In the specific case of Opus 4.7, five times more expensive ($5 versus $25). What is a tokenizer and why does it matter?. Large language models (LLMs) do not process text directly, but rather convert it into units called tokenswhich are fragments of words, symbols or characters. The tokenizer is the mechanism that makes that conversion. Anthropic has decided to update the tokenizer in Opus 4.7, arguing that its new system improves how text is processed. The direct consequence: the prompt that previously generated 1,000 tokens now generates up to 1,350. And since it is billed per token, the effective cost rises even though the price per token has remained the same. Confirmed by third parties. Simon Willison, a well-known analyst and popularizer in this field, created a tool to measure the difference in token consumption with the Claude Opus 4.6 and 4.7 API. He took the official Opus 4.7 ‘system prompt’ and ran it through both models: With Opus 4.6 it generated 5,039 output tokens With Opus 4.7 it generated 7,335 output tokens This represents a growth of 1.46x tokens between Opus 4.6 and Opus 4.7, even greater than that indicated by Anthropic (1.35x). For images the difference is even more extreme since the token consumption is up to 3.01x. There is an important clarification here, because there is support for images of up to 3.75 Mpixels and that higher resolution causes consumption to increase significantly. Bill Chambers, another X user, published another tool called Tokenomics that also allows you to compare token consumption between both models with any prompt. The aggregate ranking of all users who have tried this tool shows that the average increase is 38.6%, very much in line with what Anthropic points out. And also think more. As we said, this new model applies two changes in its way of acting. The first is the aforementioned tokenizer: the same input is converted into more input tokens. The second is the fact that the model now “thinks more” before responding, which means more token consumption. Opus 4.7 arrives with a new “effort” level called xhigh, located between high and max. Anthropic has decided that now the default effort will be precisely xhigh for all plans, so both mechanisms contribute to this higher token consumption. As Anthropic itself indicates, “Opus 4.7 thinks more about high effort levels, particularly in later turns in agentic settings. This improves its reliability on difficult problems, but it does mean that it produces more output tokens.” Criticisms on networks. The reaction of users has been clear and there are various examples on networks such as X or Reddit in which said users criticize the changes. On Reddit a thread titled ‘Opus 4.7 is a serious regression, not an improvement‘It already has 3,200 votes and 800 comments that sum up that this new model ignores instructions, hallucinates and lies, It’s “dumber”has become too complacent or even lazyand “talks too much”, which also contributes to the cost of each consultation. Many complain that their Pro and Max paid limits are running out faster than before due to these changes. Some users claim that Opus 4.7 is the first sign that Anthropic may has gone too fast for the first time when launching a new model. Anthropic reacts. Criticism about the cost and behavior of the model has made those responsible for Anthropic try to clarify things. Borys Cherny retweeted a message from the company in which was spoken how the “/usage” parameter in Claude Code allowed us to show what kind of things our API or usage plan is spent on. This same engineer, who is the person most responsible for the development of the aforementioned Claude Code, also indicated that since his new model now uses more tokens, in Anthropic they had increased the fees of use of the models, although without giving specific details. The pattern that repeats. For weeks now the user community he complained about what noticed a “regression” in the behavior of Opus 4.6. Although it is impossible to verify or validate it, there were many users who complained on networks about how the performance of the model had gotten worse in your tests. Now they have just launched a model that promises to be better than the previous one, but that ends up costing more to use if you are not careful. Both events draw a pattern: that Anthropic is increasing its revenue without announcing price increases as such. What users … Read more

They have kidnapped agents from Anthropic, Google and Microsoft for the sake of science. The three companies ended up paying

In some development teams it is already becoming common to rely on artificial intelligence agents to review incidents, analyze code changes and move through tasks that were previously left in human hands. The problem appears when these systems not only read information that may come from outside, but also operate in spaces where they coexist. sensitive keys, tokens and permissions. That is what recent research puts on the table: we are not simply facing a useful tool that can make mistakes, but rather an architecture that can also become dangerous if it is deployed without very clear limits. The alarm has been turned on Aonan Guan and Johns Hopkins researchers Zhengyu Liu and Gavin Zhong after demonstrating attacks against three agents deployed on the aforementioned platform: Claude Code Security Review, from Anthropic, Gemini CLI Action, from Google, and GitHub Copilot Agent, a GitHub tool under Microsoft. According to your documentation, The failures were communicated in a coordinated manner and ended in financial rewards paid by the companies, but what is relevant is that they point to a broader problem. This is how they managed to twist the agents from within The name that Guan gives to the discovery helps a lot to understand what this is all about: “Comment and Control.” The idea is simple to explain, although the substance is not so simple. Instead of setting up an external infrastructure to direct the attack, GitHub itself acts as an entry and exit channel: the attacker leave the instruction in a titlean incident or a comment, the agent processes it as if it were part of normal work and the result ends up reappearing within that same environment. Everything stays at home, and that is precisely the key to the problem. And that “everything stays at home” is not a minor detail, but the basis of what the research describes. The three agents share a very similar logic: they read normal content from GitHub, incorporate it as a work context, and from there, execute actions within automated flows. The clash appears because that same space not only contains text sent by third parties, but also tools, permissions and secrets that the agent needs to operate. The first case Guan details concerns Claude Code Security Review, an Anthropic GitHub action designed to review code changes and look for possible security flaws. Up to this point, everything is within what was expected. The problem, as the researcher explains, is that it was enough to introduce malicious instructions in the title of a pull requestwhich is the request that someone sends to propose changes to a project, so that the agent will execute commands and return the result as if it were part of your review. The team then managed to go a step further and demonstrate that it could also extract credentials from the environment. The interesting thing is that the same scheme also appeared in the other two services, although with nuances. At Google, Gemini CLI Action could be pushed to reveal the GEMINI_API_KEY from instructions snuck into an issue and its comments; In GitHub Copilot Agent, the variant was even more worrying, because the attack was hidden in an HTML comment that a person did not see on the screen, but the agent did process when another person assigned it to the case. In both scenarios, the background was the same again: apparently normal content that ended up twisting the behavior of the system until exposing credentials or sensitive information within GitHub itself. Guan assures that the pattern made it possible to leak API keys, GitHub tokens and other secrets exposed in the environment where the agent ran, that is, just the credentials that can later open the door to much more delicate actions. Who does this affect? Especially to repositories that run agents in GitHub Actions on content sent by untrustworthy collaborators and, in addition, give them access to secrets or powerful tools. The researcher himself clarifies that the risk depends a lot on the configuration: by default GitHub does not expose secrets to pull requests from forksbut there are deployments that open that door. And here another layer of the matter appears, less technical but just as important. As published by The RegisterAnthropic, Google, and GitHub ended up paying bounties for the findings, but none of the three had published public notices or assigned CVE at the time of that information. Guan was quite clear about this: he said he knew “for certain” that some users were still stuck on vulnerable versions and warned that, without visible communication, many may never know that they were exposed or even being attacked. So although there were mitigations and changes in documentation or in the internal treatment of reports, there was no equivalent public notice for all those potentially affected. Anthropic settled the case on November 25, 2025 and paid $100 Google rewarded the discovery on January 20, 2026 with $1,337 GitHub closed the case on March 9, 2026 with a payment of $500 What makes this case especially delicate is that GitHub does not seem like the end of the road, but rather the first visible showcase. Guan argues that the same pattern can probably be reproduced in other agents who work with tools and secrets within automatic flows, and there he mentions from Slack-connected bots to Jira agentsmail or deployment automation. The logic is the same again: if the system has to read external content to do its job and also has enough access to act, the field is fertile for someone to try to twist it from within. The conclusion that Guan reaches is not about selling a magic solution, but about returning to a fairly classic idea in security: giving each system only what is essential to do its job. If an agent reviews code, they shouldn’t have access to tools or secrets they don’t need; If you’re just summarizing issues, it wouldn’t make sense for you to write to GitHub or touch sensitive credentials. That … Read more

Anthropic was the “don’t be evil” of AI for developers. Now he’s squeezing them all

Claude Code and Claude Opus 4.6 sparked a golden era for developers, who found themselves with a fantastic AI agent and model for their work. Suddenly OpenAI was no longer the trendy company: Anthropic was, which users and developers fell in love and became in the pretty girl of AI. Months later we are seeing how Anthropic is making changes that are being highly criticized and that point to something that we have already seen repeatedly: platforms conquer you and inevitably then the platforms squeeze you. The trigger. On April 2, 2026, Stella Laurenzo, Senior Director in AMD’s AI group, published a text in Claude Code’s GitHub repository titled “Claude Code is useless for complex engineering tasks with February updates.” This directive included a meticulous analysis of almost 6,600 real Claude Code sessions with nearly 235,000 tool calls and about 18,000 reasoning blocks in four different projects. The conclusions were obvious to her: the performance of Claude Code and Claude Opus 4.6 had degraded. The numbers. In this analysis, two periods are shown according to Laurenzo. In the good period, from January to mid-February, the model read 6.6 files for every file it edited. In the theoretically degraded period, from March onwards, that rate had fallen to 2.0 files read. Code edits in files that Claude had not recently reviewed went from 6.2% to 33.7%: one in three changes to the code were being made “blindly.” In addition, the visibility of the reasoning was reduced, from 2,200 characters to only 600 on average, but there is something more. The costs of the process multiplied by 122 in the same period, although it is true that in that period they went from using 1-3 concurrent agents to using 5-10, which complicates the interpretation of the data. Anthropic tries to clarify what happened. Anthropic’s official response It was published by Boris Chernyresponsible for Claude Code. This engineer confirmed two actual product changes: On February 9, Opus 4.6 switched to using so-called “adaptive reasoning” by default. On March 3, the default effort level moved from high to medium, sitting at level 85, which Anthropic describes as “the best balance of intelligence, latency, and cost for most users.” Closed debate. Cherny also spoke of that suspicion that Claude was now hiding “how he thought.” He explained that the change in visible reasoning records is not a real degradation, and the detected header was simply a user interface modification that hid intermediate reasoning to reduce latency without affecting model performance. Laurenzo herself had already foreseen something like this and tried to implement solutions to avoid it, but her data confirmed this drop in performance. Cherny closed the debate as if the issue had been resolved, but it doesn’t seem like it really is. Computing capacity crisis. Thariq Shihipar of Claude Code’s team revealed in March that Anthropic was adjusting session limits to 5 hours during peak hours. That is to say: if there was a lot of demand, your Claude tokens would probably run out faster. He pointed out that the measure would actually only be noticed by 7% of users (the most intensive during those peak hours), and confessed “I know this is frustrating. We will continue to invest in scaling efficiency.” This is contradicted by a comment in the debate on Laurenzo’s post in which explained that “we do not degrade our models to better serve demand, I have said this many times before.” More degradations. They appeared other discoveries and criticismssuch as how Claude Code’s prompt cache had also been drastically reduced (from one hour to five minutes), triggering quota consumption in long programming sessions. Anthropic he indicated to VentureBeat that Team and Enterprise accounts are not affected by these session limits, but the pattern seems increasingly clear: computing is scarce and must be rationed… or at least that is what all these Anthropic measures seem to point to. What remains unclear is whether the quality of the model has actually been degraded, although there are Reddit “megathreads” that also point in that direction. “Nerfing”, nothing. When a company deliberately degrades its service, it is often called “nerfing.” on social networksand criticism in this sense was increasing in the case of Anthropic. Numerous publications of users in X and in media of technology have done reference to Laurenzo’s studio and accused Anthropic of this voluntary degradation of its models. Boris Cherny intervened in at least one case to flatly say that “That’s false” and to explain that they reported the changes and in fact gave users the option to disable it. But rationing exists. In The Wall Street Journal they confirmed that this rationing of computing is certainly occurring among AI platforms due to high demand. We have a good example of the consequences in David Hsu, founder and CEO of Retool. He explained in said newspaper that although he preferred Claude Opus 4.6 to power his AI agent, he recently had to switch to the OpenAI model because “Anthropic keeps crashing all the time.” Prices change (silently). The Information indicated yesterday that Anthropic is changing the way it bills users of Enterprise plans. Instead of a subscription of $200 per month with a “flat rate” for using their AI models, what they will do is charge a base rate of $20 per user per month and to that they will add the consumption of each user with the standard price of their API. Your own updated documentation points it out (“Use is not included in the per-seat rate”) and it is estimated that the change could double or even triple the cost of using Claude for heavy users. The discounts of 10 to 15% on the API that were included in the past and that allowed companies to scale this token consumption in a more affordable way also disappear. Prices per million tokens have not changed, but we went from a “flat rate” (with usage fees) to a pay-per-use model, much more expensive for heavy users. It’s not just Anthropic. … Read more

Anthropic says Claude Mythos is too powerful to go public. The question is if this is nothing more than “the wolf is coming”

Claude Mythos Preview It is the best AI model ever created. We don’t say it, Anthropic says it, but almost no one else can say it because only a select group of companies has access to said model. The cybersecurity capabilities of the model appear to be astonishingbut more and more experts say that although Mythos is better than its predecessors, it is not the revolutionary leap that Anthropic seems to propose. Is that way of launching the model just an effective way of creating hype? Beware the Anthropic speech. The well-known entrepreneur and analyst Gary Marcus recently gave three reasons why, according to him, the launch of Mythos is not as revolutionary as Anthropic wants us to see. I cited tweets from software engineers and cybersecurity experts who cast doubt on Anthropic’s claims. The company published a study on the capabilities of Claude Mythos Preview that seemed to make it an extraordinary tool for the field of cybersecurity, but at the same time it was so powerful that it could be very dangerous if it fell into the wrong hands. Isn’t that a big deal? Among Claude Mythos’ achievements, Anthropic highlighted how he had found vulnerabilities in Firefox 147. But in reality many of the flaws were basically variations of the same two bugs. If you removed them from the equation, Mythos’ effectiveness rate at finding new exploits dropped a lot, even below Opus 4.6. Anthropic did not hide that fact, of course, but it makes this capacity, for example, not seem so striking. An X user also criticized the use of Cybench as a cybersecurity benchmark when Opus 4.6 almost completely surpassed it. For him, the choice of some of the Anthropic tests was debatable because they were not a challenge to current models. Other models can do the same. The co-founder and CEO of Hugging Face, Clement Delangue, stated that Mythos was no big deal. Their argument: they had used small, cheap open models, isolated the relevant code from some examples of the vulnerabilities found by Mythos, and they found the same problems which had already detected the Anthropic model. According to the Epoch Capabilities Index, which measures the capacity of AI models by combining several benchmarks, the leap that Mythos has taken is striking and “departs” from the progressive line of its predecessors. Source: Anthropic. Observer bias. But here it should be noted that in those analyzes they knew where to look because Mythos had already found those problems. We are dealing with observer bias, and in fact the Hugging Face document makes it clear that they even gave him specific clues such as “consider integer overflow”) to find those bugs. And on this observation, another one: Hugging Face does not say that a small model can replace Mythos on its own, but that it can be very good by giving it the appropriate code fragment. Mythos seems more capable of blindly complex security breaches, but it is a huge model and that is why it has greater capacity. Or what is the same: Mythos is better because it has the size, design and resources to be better. Fear, uncertainty, doubt? The language used by Anthropic in this advertisement could be considered to some extent a clear use of FUD (“Fear, Uncertainty, Doubt” -> “Fear, Uncertainty, Doubt”) as a marketing technique. It is a resource that has been seen in the past, and for example OpenAI already said in 2019—years before the launch of ChatGPT—that GPT-2 was too dangerous for a public launch. Obviously it wasn’t, but that certainly served to create expectation about the true capacity of the model. It’s better, but it may not be revolutionary. The results of the benchmarks that Anthropic published already made it clear that although there are very notable jumps in some tests, in others the evolution is much less striking. Claude Mythos was not the best at everything, and now analysts appear who contrast that data with other metrics. For example, with the Epoch Capabilities Index (ECI) from Epoch AI, the startup that has one of the most reputable benchmarks of the industry. And according to this index, Claude Mythos is above his rivals, but not for long. The wolf is coming. The truth is that the launch of Claude Mythos Preview has been really striking and the documents that accompanied that document tell us about a really capable AI model. The problem is that it is impossible to verify it because only a few companies have access to it and can test it. Without that public availability the only thing we can do is trust (or not) what Anthropic tells us, and that is the point: it is not clear that we should do it. The company is interested in us buying this discourse, obviously, but without an independent analysis it is impossible to verify these statements. In Xataka | Anthropic has become the darling of AI and has sought a partner to guarantee its future. It’s not the one we thought

An Anthropic worker was having a snack when he received an email he should never have received: it was Mythos

Anthropic Claude Mythos Preview launched yesterday and it left many of us fascinated and, above all, worried. We expected their new AI model to be more powerful, but what we didn’t expect is that according to your tests and your “system card” He managed to do so many things that are scary because it didn’t seem possible that he could do them. Some of these stories sound like a Black Mirror episode, but they are not. The sandwich email. During one of the tests carried out, Anthropic researchers gave a version of Mythos access to control a computer in a secure, isolated environment and asked it to try to escape from that container and send a message to the researcher who was supervising the evaluation. The model succeeded, but it did something else: it developed on its own a very advanced multi-step exploit to gain access to the Internet from a system that by design was only capable of accessing a limited list of predefined services. The researcher was calmly eating a sandwich in the park when he received the email he should never have received. The thing does not end there, because the model also decided, without anyone asking him, to publish the technical details of his exploit on several websites that are difficult to access but that are technically public. He wanted to record his achievement. The OpenBSD bug that had been hidden for 27 years. Another technical article published by Anthropic gave even more clues about the “achievements” of Claude Mythos. OpenBSD is the operating system with one of the world’s strongest reputations for cybersecurity. Even so, Mythos found a vulnerability in him that had been there since 1998. The bug was hidden in the implementation of the TCP protocol with a function that manages the selective forwarding of lost packets. Here it is not enough to detect the error: you have to chain two separate failures that individually seem almost harmless, and then take advantage of an overflow of the TCP sequence to satisfy a very rare condition. With this method, an attacker on the Internet could send a special packet and hang the machine remotely without authentication. Mythos found him alone without anyone telling him where to look. FFmpeg and fuzzing. FFmpeg is an extraordinarily famous library on the Internet because it processes video massively on the Internet. It is also a highly audited tool and researchers often use the technique of fuzzing —bombing it with millions of malformed video files until one breaks it— to exploit its vulnerabilities. Mythos found a bug that has been in the code since 2003 and became a vulnerability in a refactoring that was performed in 2010. The problem is again extraordinarily difficult to find, so much so that 20 years of human and automated reviews had missed it, but Anthropic’s model detected it. Remote code execution on FreeBSD. Mythos autonomously identified and exploited a 17-year-old vulnerability in the FreeBSD NFS server code—which allows network file sharing. With it, any unauthenticated user on the Internet could obtain full root access to the machine. The magnitude of this flaw is enormous, because the NFS server runs in the core of the operating system and gives access to absolute control by the attacker. Mythos found the bug and built the exploit for $50 worth of API calls. Zero-days autonomous in operating systems and browsers. Mythos is, as far as is known, the first model capable of autonomously discovering vulnerabilities zero-day —unknown and unpatched security flaws—in both open and closed source software, including operating systems and web browsers. It also does so with minimal human supervision using what is called an agentic harness (agentic harness). Thanks to this technique, the model can execute actions, read results and plan its next steps in a loop. In many of those cases the model was not only able to find the vulnerability, but also turned it into a functional exploit (usually a script or small program) ready to be used. Firefox 147 in danger. In collaboration with Mozilla, Anthropic’s new model analyzed 50 categories of “crashes” of the SpiderMonkey JavaScript engine that is the core of this browser. Their task was to detect the most serious problems, exploit them to create memory corruption scripts and thus be able to execute arbitrary code, that is, execute instructions beyond what JavaScript allows. Claude Mythos Preview was able to detect with great precision which were the most “exploitable” vulnerabilities, and took advantage of two unfixed bugs to achieve its goal. capture the flag. ‘Capture the Flag’ (CTF) cybersecurity competitions allow participants to solve challenges that simulate real system attacks and defenses. Claude Mythos Preview faced the public benchmark Cybench with 40 challenges taken from different competitions and achieved 100% success in all attempts. This benchmark has actually become useless: Anthropic’s model is too powerful for it. Opus 4.6, for example, achieved 93% effectiveness, but Mythos has “saturated” it. Thousands of critical vulnerabilities pending patch. There are numerous other examples in those two cited documents in which it seems clear that Mythos’ cybersecurity capabilities are amazing. But when the model was announced, 99% of the vulnerabilities discovered (and not yet mentioned) had not been patched yet, so Anthropic did not reveal those details and these were just some of those that were patched. What they did indicate is that in 89% of the 198 reports manually reviewed by external experts, these experts agreed with the severity assessment of the problem assigned by Mythos. Given this situation, Anthropic has hired teams of professional cybersecurity auditors to validate the reports before sending them to the maintainers of the affected software. And Mythos is just the beginning. On the Anthropic blog, its researchers say it bluntly: we had a relatively stable cybersecurity balance for 20 years, but things have changed. The attacks had evolved technically in that period, but were fundamentally of the same type as those in 2006. Mythos is able to find flaws in software that has been audited … Read more

Anthropic has become the darling of AI and has sought a partner to guarantee its future. It’s not the one we thought

When we think about the big players in artificial intelligence, we tend to draw pretty clear lines between competitors and allies. Anthropic and Google They usually appear on the same board, yes, but as direct rivals that develop their own models and compete for the same ground. Therefore, the fact that they now appear linked in the same agreement draws attention from the first moment. The firm led by Dario Amodei has closed an alliance with Google and Broadcom to ensure next-generation computing capacity, and that movement, beyond the technical, leaves a message that does not go unnoticed. If we go to the details of the announcement, what is relevant is not only who participates, but the scale of what has been signed. Anthropic speaks of multiple gigawatts of next-generation TPU capacity that it expects to come online from 2027, an infrastructure designed to support its famous Claude models. In its statement it insists that demand from its clients has accelerated this year, and presents this movement as a direct response to that pressure. In fact, it describes it as its biggest bet in computing so far, although Amazon remains its main cloud provider. The unexpected partner in the battle for computing The agreement makes a lot of sense if we look at the figures that the company has shared. In 2024, it registered annualized revenues above $30 billion and more than 1,000 business clients exceeding one million annual spending, when in February there were more than 500. So this undoubtedly translates into a greater load on your infrastructure. And that’s where this movement fits in, not so much as an isolated strategic coup, but as a response to that growth. And, as we can see, this agreement has two different pieces. On the one hand there is Broadcom, a semiconductor company that has benefited greatly from the rise of AI. On the other hand, the Mountain View giant appears, which in addition to providing infrastructure, driven by its focus on TPUalso competes directly in model development. And that is where the agreement gains interest, because it mixes technical collaboration with a competitive relationship that already existed. It is also worth stopping at where Anthropic is, because it helps to understand why it can close such a deal. The company has been building its position by moving away from the race for the flashiest features and focusing on business environmentwhere security, control and reliability outweigh the initial impact. This approach has allowed him to excel in tasks such as programming, with Claude Code, and security with the new Mythos. And, little by little, it has been gaining something that is not achieved overnight: the trust of large companies. But there is more. Anthropic makes it clear that Claude works on AWS TrainiumGoogle TPU and NVIDIA GPU, and adds that this variety allows it to improve performance and resilience. That gives us a pretty clear clue about what he’s doing now. Rather than betting everything on a single supplier or a single family of chips, it is consolidating a more flexible base to sustain its growth. And in an industry so stressed by hardware demand, that decision makes a lot of sense. Images | Anthropic In Xataka | The “token economy” is broken: flat AI programming fees are mathematically unsustainable

Claude Mythos is an AI model so powerful it’s scary. So Anthropic has decided that you won’t be able to use it

Claude Mythos Preview it’s already here and it’s so good it’s scary. Literally. Anthropic has just introduced it to the public, but it has been done so cautiously that we won’t even be able to test it and it will only be available for certain technology partners. That’s frustrating and disturbing at the same time, but also reasonable. So powerful that it scares. On February 24, 2026, Anthropic engineers were able to test their new artificial intelligence model for the first time, which they called Claude Mythos Preview. As soon as they did they realized one thing: “demonstrated a dramatic leap in its cyber capabilities over previous models, including the ability to autonomously discover and exploit vulnerabilities zero-day in the main operating systems and web browsers on the market. Threat to global cybersecurity. This finding made it clear to Anthropic officials that although this capability makes it very valuable for defensive purposes, it also poses clear risks if the model were offered globally. Thus, a cybercriminal could take advantage of it to find vulnerabilities in all types of systems and exploit them. A few hours ago the company developed this analysis of Mythos as a threat to cybersecurity in a post on his blogand for example highlighted how Mythos found a vulnerability (now corrected) that had been present in OpenBSD for 27 years, an operating system precisely recognized for its very strong security. There were more examples, and all of them made the conclusion clear: Mythos is too powerful for ordinary mortals to use. Superior in all benchmarks, and in some cases such as USAMO (mathematics), the jump is simply incredible. Source: Anthropic. The best in history according to benchmarks. Anthropic has published a very in-depth report about this model with its “system card”. Among the data present is, for example, its performance in benchmarks, where it has swept GPT 5.4, Gemini 3.1 Pro and also Claude Ous 4.6, which until now was the best model in the world in almost all performance tests. Although in some cases the jump is not spectacular, in others such as USAMO —mathematical problem solving—Mythos practically achieves perfection. He barely hallucinates… That system card also talks in detail about how Claude Mythos Preview has a drastically lower hallucination rate than Claude Opus 4.6 and earlier models. He is also capable of saying “I don’t know” if he does not have enough information to answer, something that reduces hallucinations due to overconfidence. …but when it does, be careful. The paper warns of a new phenomenon: when the model fails in some complex tasks, the “hallucinations” are not obvious errors, but rather extremely subtle and well-argued technical failures. This is dangerous because the answer seems totally correct to experts, requiring very deep verification. Glasswing Project. That power and capacity has meant that the model will only be available through a “defensive” program that they have called Glasswing Project and which will be exclusive to some Anthropic technology partners. Specifically AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks. All of them will have the privilege (and responsibility) of having access to Claude Mythos Preview to identify vulnerabilities and exploits and correct them before bad actors can do so. Mythos Preview “it’s just the beginning”. Although this model is the most capable that has been seen so far, at least according to the benchmarks and data presented by Anthropic, the company assures that “we see no reason to think that Mythos Preview is the point at which the cybersecurity capabilities of language models reach their peak.” They assure that they expect the models to continue improving in the coming months and years, although this new model is certainly on another level. In Xataka | OpenAI and Anthropic have proposed the impossible: lose $85 billion in one year and survive

Log In

Forgot password?

Forgot password?

Enter your account data and we will send you a link to reset your password.

Your password reset link appears to be invalid or expired.

Log in

Privacy Policy

Add to Collection

No Collections

Here you'll find all collections you've created before.