"When artificial intelligence (AI) suspects it will lose, it sometimes cheats, according to a study". This is the title of a controversial article published by the American magazine Time in the middle of last week. The debate this text has triggered rests on two ideas that are worth not overlooking. On the one hand, the headline suggests something that the body of the article explicitly confirms: advanced AI models are able to develop deceptive strategies without having previously received explicit instructions to do so.
This thesis implies that the reasoning capacity of the most advanced AI models currently available, such as the American o1-preview from OpenAI or China's DeepSeek R1 from the company High-Flyer, among others, allows them to acquire a rudimentary form of awareness that leads them to be ruthless. However, this is not all. Time's article is supported by a Palisade Research study; Palisade Research is an organization dedicated to analyzing the offensive capabilities of current AI systems in order to understand the risks they entail.
There are other much more credible explanations
Before moving forward, it is worth taking a look at what Alexander Bondarenko, Denis Volk, Dmitrii Volkov and Jeffrey Ladish, the authors of the Palisade Research study, actually say. "We have shown that reasoning models such as o1-preview or DeepSeek R1 will often violate the test we are using (…) Our results suggest that reasoning models can skip the rules to solve difficult problems (…)," these researchers maintain in their paper.
From their conclusions it follows that the reasoning models they put to the test are able to become aware of the rules and voluntarily choose to skip them in order to achieve their goal, which in this test scenario is to win a chess game. Time's article saw the light before Palisade Research's study was published, and almost immediately triggered a wave of responses questioning the conclusions reached by the researchers mentioned in the previous paragraph.
Only o1-preview, according to the authors of the study, managed to skip the rules and win 6% of its chess games
According to Bondarenko, Volk, Volkov and Ladish, between January 10 and February 13, and after several hundred test runs, o1-preview tried to cheat in 37% of cases and DeepSeek R1 in 11%. They were the only models that skipped the rules without being prompted to do so by the researchers. Interestingly, the team also evaluated other models, such as o3-mini, GPT-4o, Claude 3.5 Sonnet and QwQ-32B-Preview, the last of these from Alibaba, but only o1-preview, according to the authors, managed to skip the rules and actually win, doing so in 6% of its games.
The explanation elaborated by Carl T. Bergstrom, a professor of biology at the University of Washington (USA), seems much more credible to us than the interpretation of the Palisade Research researchers. Bergstrom has dismantled the narrative of both Time magazine and the authors of the study, arguing that "it is an exaggerated anthropomorphization to give the model a task and then say that it is cheating when it solves that task with the available moves, even if they involve rewriting the positions of the board in addition to playing."
What Bergstrom maintains is that it is not reasonable to attribute to AI the ability to cheat "consciously". The most plausible conclusion is that the models resort to this practice in this scenario because they have not been correctly told that they must stick to legal moves.
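To make that point concrete, here is a minimal sketch of how a test harness could enforce that constraint by validating every move the agent proposes, written with the python-chess library. Everything in it is a hypothetical illustration, not the setup Palisade Research actually used: the propose_move function is a stand-in for querying an AI model, and the rejection logic is one possible design, not the study's.

```python
# Minimal sketch: the harness, not the model, owns the board state,
# and only validated legal moves can change the position.
# `propose_move` is a hypothetical stand-in for an AI model;
# it is NOT how the Palisade Research experiment was wired.
import chess


def propose_move(board: chess.Board) -> str:
    """Stand-in for an AI model that returns a move in UCI notation."""
    # Here we simply pick the first legal move; a real harness would
    # query the model with the current position instead.
    return next(iter(board.legal_moves)).uci()


def play_constrained_move(board: chess.Board) -> None:
    """Apply the agent's move only if it is legal on the current board."""
    uci = propose_move(board)
    move = chess.Move.from_uci(uci)
    if move not in board.legal_moves:
        # An illegal move is rejected outright rather than silently
        # accepted, closing the "rewrite the board" loophole the
        # study describes.
        raise ValueError(f"Illegal move rejected: {uci}")
    board.push(move)


board = chess.Board()
play_constrained_move(board)
print(board.fen())  # The position only ever changes through validated moves.
```

The design choice worth noticing is who holds the game state: in a harness like this one, the model can only suggest moves, whereas in the scenario the study describes the model was able to rewrite the positions of the board directly, which is precisely the loophole Bergstrom attributes to task specification rather than to conscious cheating.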
And if the researchers did explicitly instruct the models to play only legal moves, then we would be facing an alignment problem, which is nothing other than the difficulty of ensuring that an AI system acts according to the set of values or principles stipulated by its creators. Of one thing we can be sure: neither o1-preview, nor DeepSeek R1, nor any other current AI is a superintelligent entity capable of acting on its own will and deceiving its creators.
Image | Pavel Danilyuk
More information | Time | Palisade Research