usatoday24

Creating an image with AI is no longer as surprising as it once was. What starts to make a difference is the ability to modify it, give it continuity and turn an initial idea into something more elaborate without losing the thread along the way. In video, that challenge is much greater: there is movement, time, physics, and characters that must stay coherent from one frame to the next. Gemini Omni arrives with the promise of addressing this problem and making editing a much easier task.

Google DeepMind itself asks us to think of Gemini Omni as Nano Banana, but for video. The reference makes sense because Nano Banana was the Google image generator that took visual creation with AI to a very striking scale. The first version, released in August 2025, added 13 million users in four days and had generated more than 5 billion images by mid-October.

Google now introduces Gemini Omni Flash as the first model in the Gemini Omni family. According to the company, it is designed to create content from any input. The idea is that the user can combine images, audio, video and text as a starting point to generate high-quality videos supported by Gemini’s real-world knowledge.

A video generation model that is committed to coherence

The most interesting part is how Google describes the editing process. The model is presented not only as a tool for generating a clip from scratch, but as a system that can work on a scene through chained instructions. The company talks about changing specific elements or completely transforming a starting video: adjusting aesthetics, action, environment, angle, style or specific details. It also promises to maintain character consistency, preserve scene continuity, and offer more coherent physics.

In its announcement, Google shows how Gemini Omni can start from a scene and modify it with a direct instruction, whether to change the material of an object, alter an action, or turn a complex idea into a visual explanation. Let’s look at some example prompts.

“Make the sculpture out of bubbles”
“When the person touches the mirror, make the mirror ripple beautifully like liquid, and the person’s arm turns into reflective mirror material”
“Claymation explainer of protein folding, everything is made out of clay, no hands, stop motion, accurate”

At Xataka we ran a first test with a recognizable landmark: the Puerta de Alcalá, in Madrid. The starting point was a static photograph and the prompt we used was the following:

“Create a video from this image. Cars are moving forward and people are walking.”

The idea was to see to what extent Gemini Omni could turn a real scene into a short moving clip. The resulting video shows precisely that attempt to animate the original image, with cars moving forward, pedestrians walking, and ambient sound that fits the scene. It also appears to retain some visible branding elements on the vehicles, especially the Mercedes-Benz logo, although in other cases, such as Fiat, the result is less clear.

Let’s talk about availability. Google says that Gemini Omni Flash is starting to reach Google AI Plus, Pro and Ultra subscribers through Gemini and Google Flow, while its free rollout in YouTube Shorts and the YouTube Create app begins this week.

In our test with a corporate account, however, we ran into a fairly tight limit: after generating three videos, the system warned us that we had reached our video generation limit until May 20 at 7:59 p.m. This is not too surprising if we think about what is happening under the hood: creating video with AI requires a lot of resources, so everything suggests that Google is rationing access, at least in this first phase.

When we talk about video generation with artificial intelligence, one of the first names likely to come to mind is Sora. It arrived as one of OpenAI’s great promises in this field. Its run, however, ended up being much shorter than that initial ambition suggested. Its website and app were no longer available at the end of April 2026, although the API will continue to work until September 24.

Images | Google | Xataka

Gemini Omni wants to do with video what Nano Banana did with images: Google is aiming very high
