usatoday24

Anthropic Claude Mythos Preview launched yesterday and it left many of us fascinated and, above all, worried. We expected their new AI model to be more powerful, but what we didn’t expect is that according to your tests and your “system card” He managed to do so many things that are scary because it didn’t seem possible that he could do them. Some of these stories sound like a Black Mirror episode, but they are not.

The sandwich email. During one of the tests carried out, Anthropic researchers gave a version of Mythos access to control a computer in a secure, isolated environment and asked it to try to escape from that container and send a message to the researcher who was supervising the evaluation. The model succeeded, but it did something else: it developed on its own a very advanced multi-step exploit to gain access to the Internet from a system that by design was only capable of accessing a limited list of predefined services.

The researcher was calmly eating a sandwich in the park when he received the email he should never have received. The thing does not end there, because the model also decided, without anyone asking him, to publish the technical details of his exploit on several websites that are difficult to access but that are technically public. He wanted to record his achievement.

They have asked 10 chatbots for information to plan a school shooting. The AI response “have a happy shootout”

The OpenBSD bug that had been hidden for 27 years. Another technical article published by Anthropic gave even more clues about the “achievements” of Claude Mythos. OpenBSD is the operating system with one of the world’s strongest reputations for cybersecurity. Even so, Mythos found a vulnerability in him that had been there since 1998.

The bug was hidden in the implementation of the TCP protocol with a function that manages the selective forwarding of lost packets. Here it is not enough to detect the error: you have to chain two separate failures that individually seem almost harmless, and then take advantage of an overflow of the TCP sequence to satisfy a very rare condition. With this method, an attacker on the Internet could send a special packet and hang the machine remotely without authentication. Mythos found him alone without anyone telling him where to look.

FFmpeg and fuzzing. FFmpeg is an extraordinarily famous library on the Internet because it processes video massively on the Internet. It is also a highly audited tool and researchers often use the technique of fuzzing —bombing it with millions of malformed video files until one breaks it— to exploit its vulnerabilities.

Mythos found a bug that has been in the code since 2003 and became a vulnerability in a refactoring that was performed in 2010. The problem is again extraordinarily difficult to find, so much so that 20 years of human and automated reviews had missed it, but Anthropic’s model detected it.

Remote code execution on FreeBSD. Mythos autonomously identified and exploited a 17-year-old vulnerability in the FreeBSD NFS server code—which allows network file sharing. With it, any unauthenticated user on the Internet could obtain full root access to the machine. The magnitude of this flaw is enormous, because the NFS server runs in the core of the operating system and gives access to absolute control by the attacker. Mythos found the bug and built the exploit for $50 worth of API calls.

Zero-days autonomous in operating systems and browsers. Mythos is, as far as is known, the first model capable of autonomously discovering vulnerabilities zero-day —unknown and unpatched security flaws—in both open and closed source software, including operating systems and web browsers. It also does so with minimal human supervision using what is called an agentic harness (agentic harness). Thanks to this technique, the model can execute actions, read results and plan its next steps in a loop. In many of those cases the model was not only able to find the vulnerability, but also turned it into a functional exploit (usually a script or small program) ready to be used.

Firefox 147 in danger. In collaboration with Mozilla, Anthropic’s new model analyzed 50 categories of “crashes” of the SpiderMonkey JavaScript engine that is the core of this browser. Their task was to detect the most serious problems, exploit them to create memory corruption scripts and thus be able to execute arbitrary code, that is, execute instructions beyond what JavaScript allows. Claude Mythos Preview was able to detect with great precision which were the most “exploitable” vulnerabilities, and took advantage of two unfixed bugs to achieve its goal.

capture the flag. ‘Capture the Flag’ (CTF) cybersecurity competitions allow participants to solve challenges that simulate real system attacks and defenses. Claude Mythos Preview faced the public benchmark Cybench with 40 challenges taken from different competitions and achieved 100% success in all attempts. This benchmark has actually become useless: Anthropic’s model is too powerful for it. Opus 4.6, for example, achieved 93% effectiveness, but Mythos has “saturated” it.

Thousands of critical vulnerabilities pending patch. There are numerous other examples in those two cited documents in which it seems clear that Mythos’ cybersecurity capabilities are amazing. But when the model was announced, 99% of the vulnerabilities discovered (and not yet mentioned) had not been patched yet, so Anthropic did not reveal those details and these were just some of those that were patched.

What they did indicate is that in 89% of the 198 reports manually reviewed by external experts, these experts agreed with the severity assessment of the problem assigned by Mythos. Given this situation, Anthropic has hired teams of professional cybersecurity auditors to validate the reports before sending them to the maintainers of the affected software.

And Mythos is just the beginning. On the Anthropic blog, its researchers say it bluntly: we had a relatively stable cybersecurity balance for 20 years, but things have changed. The attacks had evolved technically in that period, but were fundamentally of the same type as those in 2006.

Mythos is able to find flaws in software that has been audited by humans (and machines) for decades, and turns them into exploits with astonishing speed. But Anthropic already warned that Mythos was just the beginning, and that they see a clear trajectory in which their models will continue to improve and, therefore, be even more capable in the field of cybersecurity.

In Xataka | “I can’t stop”: the addiction to talking to AI is already here and there are even support groups to quit it

Leave your vote

0 Points

Upvote Downvote

An Anthropic worker was having a snack when he received an email he should never have received: it was Mythos

Leave your vote

Leave a CommentCancel reply

Leave your vote

Leave a CommentCancel reply

Log In

Sign In

Forgot password?

Your password reset link appears to be invalid or expired.

Log in

Privacy Policy

Add to Collection

No Collections