The Google DeepMind team has announced its new AI model for generating interactive worlds. At the end of last year we were surprised by what Genie 2 could do, and the new version is an important leap, one that Google sees as a step toward artificial general intelligence, or AGI: AI that can match the abilities of the best humans.
Genie 3. It is DeepMind's new world model. It creates interactive worlds that we can explore, all from a text prompt. The previous model was very limited and could only be used for a few seconds, but DeepMind promises that Genie 3 worlds can be explored for “several minutes.” In addition, the resolution has improved to 720p at 24 fps. The model builds on Genie 2 and Veo 3.
It has memory. This is the most important improvement in the new model. The world is generated by AI as we explore it, but if we turn around and look at something we had already seen, it remains the same. We can also change something, such as painting on a wall, and it stays the way we left it. This did not happen in previous versions, and its creators say they did not explicitly program it to do that. As explained in a TechCrunch article, Genie 3 is able to remember what it has already generated in order to train itself, and in this way it learns how the world and its physics work.
Interactive. Another highlight is that events can be added with additional prompts. In its article, DeepMind shows several interactive examples, such as a meadow in which we can choose whether a tractor, a bear, a horse or hot air balloons will appear. DeepMind calls these “promptable world events,” and they also let you change aspects such as the weather.
Why it matters. World models are useful in different settings, such as creating environments for real-time games, education or training AI agents. In its blog, Google describes Genie 3 as a key step toward AGI, the superior artificial intelligence that so many companies are trying to reach as soon as possible. These worlds can serve as a training ground for other AIs, including robots, for which simulating real scenarios is a challenge.
In the presentation, the DeepMind team explained how they placed an agent in a setting that simulated a warehouse and asked it to approach certain objects, such as a green trash bin. It succeeded in every test. According to the DeepMind team, “the fact that (the agent) is able to achieve this is because Genie 3 remains coherent.”
The competition. The fiercest AI competition, at least in end-user products, is in chatbots and, to a lesser extent, in video and audio generators. World models are less popular with the public, and there is not much competition. Nvidia presented Cosmos at the beginning of the year, and some companies such as World Labs offer similar proposals. We would like to end this article with a link so you can try it, but Genie 3 is only available in beta to a very limited group of academics.
Image | DeepMind