Google Launches Gemini Omni Flash, Its First Conversational Video Editing Model

Google DeepMind on June 30 released two new generative AI models designed to make image and video creation faster and cheaper for developers: Gemini Omni Flash for video generation and Nano Banana 2 Lite for rapid image output.

Conversational video editing

Gemini Omni Flash (gemini-omni-flash-preview) is a multimodal model that generates videos up to 10 seconds long at 720p from text descriptions, still images, or existing video clips. Its headline feature is conversational editing: instead of cutting clips on a timeline, users describe changes in plain language and the model applies them, maintaining text-action synchronization and drawing on real-world context to render scenes accurately.

The model is priced at $0.10 per second of video output and is available now through Google AI Studio, the Gemini API, the Gemini app, and Google Flow. Google notes three current limitations: no audio-reference support, a 10-second length cap, and uneven character consistency across scene changes. Longer videos and scene extension are slated for future updates.
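For developers budgeting against that per-second rate, the arithmetic is simple: a clip at the 10-second cap costs $1.00. The sketch below is a rough illustration rather than anything from Google's SDK. The function name and constants are hypothetical, and it assumes billing is strictly proportional to output seconds, which the announcement does not spell out.

```python
from decimal import Decimal

# Figures quoted in the article; the names are illustrative, not an official API.
PRICE_PER_SECOND = Decimal("0.10")  # USD per second of video output
MAX_CLIP_SECONDS = 10               # current length cap per clip


def estimate_video_cost(clip_seconds):
    """Estimate the total USD cost for a list of clip lengths in seconds.

    Assumes cost scales linearly with output seconds (an assumption).
    """
    total = Decimal("0")
    for seconds in clip_seconds:
        if seconds <= 0 or seconds > MAX_CLIP_SECONDS:
            raise ValueError(
                f"clip length must be between 0 and {MAX_CLIP_SECONDS} seconds"
            )
        # Convert via str so float inputs do not carry binary rounding error.
        total += PRICE_PER_SECOND * Decimal(str(seconds))
    return total


if __name__ == "__main__":
    # Three clips of 4, 6, and 10 seconds: 20 seconds of output in total.
    print(estimate_video_cost([4, 6, 10]))
```

Decimal is used instead of float so that small per-second charges add up without rounding drift.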

Fast, cheap image generation

Nano Banana 2 Lite (gemini-3.1-flash-lite-image) is now generally available, and Google bills it as its fastest and most cost-efficient image model. It produces an image in roughly four seconds at a cost of $0.034 per 1,000 images while, according to Google, retaining strong prompt adherence, character consistency, and legible in-image text rendering.
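Taking the quoted figures at face value, a batch estimate is a matter of scaling the per-1,000 rate and the per-image latency. The sketch below is illustrative only. The function and constant names are hypothetical, and it assumes images are generated one after another with no parallelism, which will overstate wall-clock time for concurrent requests.

```python
from decimal import Decimal

# Figures as quoted in the article; names are illustrative, not an official API.
PRICE_PER_1000_IMAGES = Decimal("0.034")  # USD per 1,000 images
SECONDS_PER_IMAGE = 4                     # "roughly four seconds" per image


def estimate_image_batch(n_images):
    """Return (cost_usd, sequential_seconds) for a batch of n_images.

    Assumes linear pricing and strictly sequential generation (assumptions).
    """
    if n_images < 0:
        raise ValueError("n_images must be non-negative")
    cost = PRICE_PER_1000_IMAGES * n_images / 1000
    sequential_seconds = SECONDS_PER_IMAGE * n_images
    return cost, sequential_seconds


if __name__ == "__main__":
    cost, seconds = estimate_image_batch(5000)
    print(f"cost: ${cost}, sequential time: {seconds} s")
```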

The model is live on Google AI Studio, the Gemini API, and the Gemini Enterprise Agent Platform. A consumer rollout to AI Mode in Search, Google Photos, NotebookLM, and Google Ads is underway.

Product managers Alisa Fortin and Anish Nangia described the two models as complementary: developers can chain rapid image generation with Omni Flash’s video animation to build end-to-end multimedia pipelines.