Google’s Gemini Omni Model Creates Video From Text, Images, and Audio

Edit Videos by Just Talking to Them


Last year's Nano Banana brought Gemini to image generation. Now Google is going a step further. Gemini Omni is the company's new model family that brings reasoning and creation together. Its first output modality is video, and it's available today.

The first model in the family, Gemini Omni Flash, is rolling out now to Gemini app subscribers.

What Omni Can Do


The core idea is that you can feed Omni any combination of inputs — images, video, audio, text, or all four — and get a high-quality video out. What sets it apart from basic generation tools is that edits are conversational and cumulative. Each instruction builds on the last, so characters stay consistent, the physics hold up, and the scene remembers what you've already done.

Practically, that means you can take a video you shot, describe a change, and have Omni rewrite the action. You could ask it to make a sculpture look like it's made of bubbles, add lights that sync to music, or shift the camera to an over-the-shoulder angle after multiple prior edits — all without losing the thread of the original scene.

World Knowledge Meets Physics


Omni isn't just pattern-matching on visual styles. Because it sits on top of Gemini's underlying model, it can draw on the model's world knowledge and apply it to what it's creating. That means more accurate physics in generated footage, the ability to create complex visual explainers from short prompts, and the capacity to blend cultural or scientific context meaningfully into scenes.

The model also accepts style references and motion references from input videos. You can direct it to apply the swimming motion of a whale to a different character or surface, or match a visual style from one reference clip while using motion data from another.

Availability and Responsibility


Gemini Omni Flash is live today for Google AI Plus, Pro, and Ultra subscribers globally via the Gemini app and Google Flow. It's also available at no cost to YouTube Shorts and YouTube Create users starting this week. Developer and enterprise API access is coming in the weeks ahead.

All videos created with Omni include an imperceptible SynthID watermark, and you can verify AI-generated content through the Gemini app, Gemini in Chrome, and Google Search. Avatar creation — which generates a digital version of your own likeness for video — is supported, though Google says it's still working through how to roll out voice-altering capabilities responsibly.
