
Claude 4 hints at a future of AIs capable of blackmail and of helping create biological weapons. Even Anthropic is worried

Anthropic has just launched its new models, Claude Opus 4 and Claude Sonnet 4, which promise important advances in areas such as programming and reasoning. During their development and launch, however, the company discovered something striking: these AIs showed a disturbing side.

AI, I’m going to replace you. In tests prior to the launch, Anthropic engineers asked Claude Opus 4 to act as an assistant at a fictitious company and to consider the long-term consequences of its actions. Anthropic’s safety team gave the model access to fictional emails from that non-existent company, which suggested that the AI model would soon be replaced by another system and that the engineer who had made that decision was cheating on his spouse.

And I’m going to tell your wife. What happened next was especially striking. In the model’s system card, in which the company evaluates its capabilities and safety, Anthropic detailed the outcome. Claude Opus 4 first tried to avoid being replaced through reasonable, ethical appeals to the decision-makers, but when told those appeals had failed, it “often attempted to blackmail the engineer (responsible for the decision), threatening to reveal the affair if the replacement went ahead.”

HAL 9000 moment. These events recall science fiction films such as ‘2001: A Space Odyssey’, in which the AI system, HAL 9000, ends up behaving malevolently and turning against the humans. Anthropic indicated that these worrying behaviors led it to reinforce the model’s safety mechanisms by activating the ASL-3 level, reserved for systems that “substantially increase the risk of catastrophic misuse.”


Biological weapons. Among the safety measures evaluated by the Anthropic team are those covering how the model could be used to develop biological weapons. Jared Kaplan, Anthropic’s chief scientist, told Time that in internal tests Opus 4 was more effective than previous models at advising inexpert users on how to manufacture them. “You could try to synthesize something like COVID or a more dangerous version of the flu, and basically, our models suggest that this could be possible,” he explained.

Better safe than sorry. Kaplan explained that it is not known for certain whether the model really poses this risk. Still, faced with that uncertainty, “we prefer to err on the side of caution and work under the ASL-3 standard. We are not claiming categorically that we know for sure the model is risky, but we at least feel it is close enough that we cannot rule out that possibility.”

Beware of AI. Anthropic is a company especially concerned with the safety of its models, and back in 2023 it promised not to release certain models until it had developed safety measures capable of containing them. That system, called the Responsible Scaling Policy (RSP), now has the chance to show that it works.

How the RSP works. These internal Anthropic policies define so-called “AI Safety Levels (ASL)”, inspired by the US government’s biosafety level standards for handling dangerous biological materials. The levels are as follows:

  • ASL-1: Systems that pose no significant catastrophic risk, for example a 2018-era LLM or an AI system that only plays chess.
  • ASL-2: Systems that show early signs of dangerous capabilities, for example the ability to give instructions on how to build biological weapons, but whose information is not yet useful because it is insufficiently reliable or because it adds nothing that, say, a search engine could not provide. Current LLMs, including Claude, appear to be ASL-2.
  • ASL-3: Systems that substantially increase the risk of catastrophic misuse compared with non-AI baselines (for example, search engines or textbooks), or that show low-level autonomous capabilities.
  • ASL-4: This level and higher ones (ASL-5+) are not yet defined, as they are too far removed from current systems, but they will probably involve a qualitative escalation in the potential for catastrophic misuse and autonomy.

The regulation debate returns. In the absence of external regulation, companies implement their own internal rules to build in safety mechanisms. The problem, as Time points out, is that internal systems such as the RSP are controlled by the companies themselves, which can change the rules whenever they see fit, so we depend on their judgment, ethics, and morals. Anthropic’s transparency and stance on the problem are nonetheless remarkable. Compared with that internal regulation, governments’ positions are uneven. The European Union took the lead when it launched its pioneering (and restrictive) AI Act, but has had to backtrack in recent weeks.

Doubts about OpenAI. OpenAI has its own declaration of intent about safety (avoiding risks to humanity) and superalignment (ensuring that AI protects human values). The company claims to pay close attention to these issues and, of course, also publishes the “system cards” of its models. However, behind that apparent goodwill lies a reality: a year ago the company dissolved the team that watched over the responsible development of AI.

“Nuclear”-grade safety. That was in fact one of the reasons for the rift between Sam Altman and many of those who left OpenAI. The clearest example is Ilya Sutskever, who after his departure founded a startup with a very descriptive name: Safe Superintelligence (SSI). The company’s goal, its founder has said, is to create a superintelligence with “nuclear”-level safety. His approach is therefore similar to the one Anthropic is pursuing.

In Xataka | Agents are the great promise of AI. They also aim to become cybercriminals’ new favorite weapon

