the reason is the laws of physics
You surely already know (online advertising reminds you day in and day out) that a simple prompt can generate a video game. The AI builds it for you, but what it can’t do is play it. The reason is not that games are difficult in the abstract: it is that the real world obeys the same physical laws everywhere, and video games do not.

Making, not playing. The paradox is striking: with tools like Cursor or Claude, a single prompt generates a working clone of a classic game. ‘Asteroids’, for example. Yet that same system would not even get past the first level of its own creation. Julian Togelius, director of the Game Innovation Lab at New York University and co-founder of the testing company Modl.ai, has been investigating why for months, and has broken it down in an interview.

Programming is not a game. Togelius describes programming in structural terms: it is a very well-designed game. Each line of code comes with a clear statement, a verifiable success criterion and feedback on possible failures, and the program indicates exactly where and why it failed. LLMs (large language models) have been trained on massive amounts of code and fine-tuned with reinforcement learning to solve exactly that type of problem. In terms of task structure, programming is an exceptionally “well-behaved” game, as Togelius puts it. That’s why so many people find programming fun. Video games, however, are another story: the action space is governed by more arbitrary rules, feedback can be immediate or take hours to arrive, spatial reasoning is essential and the margin of error is much smaller. When an AI model is asked to play something, the result documented in Togelius’s paper is unequivocal: “absolute failure.”

With a guide, please. Gemini 2.5 Pro completed ‘Pokémon Blue’ in May 2025, but it took considerably longer than any human player, made repetitive mistakes, and relied on auxiliary software to get there.
TIME magazine analyzed why the best AI systems still struggle with ‘Pokémon’. And that is one of the few titles they manage to finish. They achieve this because these systems have specific APIs for consulting strategy guides. What makes it easier for them is that ‘Pokémon’ and ‘Minecraft’ (another title that AIs can navigate) are two of the most documented franchises in the history of video games, with millions of hours of walkthroughs available on the internet.

The key is in physics. But why can a language model write an essay on quantum physics and at the same time fail at both ‘Halo’ and ‘Space Invaders’? Togelius’s answer is that “those two games are more different from each other, in a sense, than two different academic essays.” Looked at another way: video games are very heterogeneous. Each one invents its own rules, its own spatial logic, its own reward system. The mechanics of a platform game are completely different from those of ‘Tetris’. Spatial reasoning (where objects are, how they move, how they relate) does not appear in the pre-training data of language models, because it does not carry over from one game to the next.

Now consider a task that seems harder than playing ‘Super Mario’: driving a car. AIs do that well. The difference from games is that the real world obeys the same physical laws anywhere on the planet. The asphalt behaves the same in San Francisco as in Shanghai, the traffic lights follow the same principles, the vehicle always responds the same way. As Togelius points out, “driving is much more homogeneous than video games as a whole.” Learn to drive and you can do it anywhere on the planet. Learn to play ‘Doom’ and you will have no idea how to play ‘Age of Empires’.

The definitive criterion.
That is why Togelius proposes video games as a benchmark for judging an AI’s success: the test is whether an agent capable of learning can complete any game in the Steam top 100 in roughly the same time as a skilled human player, without access to prior documentation or game-specific integration. By that yardstick (which does not require winning on the first try, only learning at a human pace), no system today comes close.