Generative AI is moving quickly into video creation. In September 2025 OpenAI announced Sora 2 — a major leap in text‑to‑video models. According to OpenAI’s research blog, Sora 2 is more physically accurate and realistic than earlier systems, offers synchronized dialogue and sound effects, and can follow intricate multi‑shot instructions while preserving world state. This upgrade positions Sora 2 as a new benchmark in AI video generation.
Google’s Veo 3 has been the primary competitor. A DeepMind overview states that Veo 3 brings built‑in audio generation, enabling sound effects, ambient noise and dialogue, while delivering best‑in‑class quality, realism and prompt adherence. In September 2025 Google updated Veo 3 and the cheaper Veo 3 Fast with the ability to generate 1080p videos and vertical 9:16 format, while slashing the price from $0.75/second to $0.40/second and from $0.40/second to $0.15/second respectively. These rapid improvements fuel speculation about the next incremental release — Veo 3.1.
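To make the price cut concrete, here is a minimal sketch that turns the per-second prices quoted above into a per-clip cost and a percentage reduction. The function and dictionary names are hypothetical, and the 8-second clip length is only an illustrative assumption; the prices are the ones stated in the paragraph above, not values read from any Google API.

```python
# Per-second prices in USD as quoted in the text above,
# before and after the September 2025 update.
PRICES_PER_SECOND = {
    "Veo 3": {"before": 0.75, "after": 0.40},
    "Veo 3 Fast": {"before": 0.40, "after": 0.15},
}


def clip_cost(model: str, seconds: float, when: str = "after") -> float:
    """Return the cost in USD of one clip, rounded to cents."""
    return round(PRICES_PER_SECOND[model][when] * seconds, 2)


def price_cut_percent(model: str) -> float:
    """Return the size of the price cut as a percentage of the old price."""
    price = PRICES_PER_SECOND[model]
    return round((price["before"] - price["after"]) / price["before"] * 100, 1)


if __name__ == "__main__":
    clip_seconds = 8  # illustrative clip length, an assumption
    for model in PRICES_PER_SECOND:
        old = clip_cost(model, clip_seconds, "before")
        new = clip_cost(model, clip_seconds, "after")
        print(f"{model}: ${old:.2f} -> ${new:.2f} "
              f"({price_cut_percent(model)}% cheaper)")
```

Under these assumptions, an 8-second Veo 3 clip drops from $6.00 to $3.20 (about 46.7% cheaper), and an 8-second Veo 3 Fast clip drops from $3.20 to $1.20 (62.5% cheaper).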
Google has not officially announced Veo 3.1, but reports point to the existence of a “Veo 3.1 waitlist.” Veo 3.1 is expected to be an incremental update to Veo 3 and Veo 3 Fast that extends video length and improves control. Leaked menus reportedly show options for 480p, 720p and 1080p output, longer 30‑second clips, and features such as product placement and “banana placement” (a whimsical test of accurate object positioning).
Google’s Veo series demonstrates rapid iteration in video generative models:
The following are expectations drawn from leak analyses and community speculation:
| Expected advance | Evidence/rumor | Why it matters |
|---|---|---|
| Longer clips (up to 30 s) | Rumored waitlist screens show 30 s duration settings, extending the current 8–10 s cap in Veo 3. | Longer duration allows narrative sequences and multi‑shot stories without stitching clips. |
| Higher resolution & improved aspect ratios | Leaked settings mention 480p, 720p and 1080p outputs by default; this would align with the September 2025 update that added 1080p to Veo 3/Fast. | Native 1080p would make Veo competitive with Sora 2, which already produces high‑fidelity videos. |
| Enhanced physics & object interactions | Veo 3 already excels at physics and prompt adherence; leaks suggest 3.1 will further refine motion realism, enabling subtle interactions like product placement or balancing objects (the “banana placement” challenge). | Stronger physics simulation would close the gap with Sora 2’s accurate modeling of buoyancy and inertia. |
| Product placement & character consistency tools | Higgsfield demos reportedly show UI options for inserting branded objects and maintaining consistent characters across shots. | Allows marketers and storytellers to embed products or maintain consistent actors throughout a video. |
| Better multi‑prompt and editing controls | Speculation includes multi‑prompt support and editing tools akin to Google’s AI Studio MediaSim app (demoed for Veo 3). | Improves creative control, enabling multi‑scene composition, transitions and style changes within one generation. |
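The "without stitching clips" point in the first row comes down to simple arithmetic. The sketch below counts how many separate generations a sequence of a given length would need under a per-clip cap. The function name is hypothetical; the 8–10 s caps are the figures quoted for Veo 3 in the table, and the 30 s cap is only the rumored Veo 3.1 value.

```python
import math


def clips_needed(target_seconds: float, max_clip_seconds: float) -> int:
    """Return how many clips must be generated and stitched together
    to cover target_seconds when each clip is capped at max_clip_seconds."""
    if target_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(target_seconds / max_clip_seconds)


if __name__ == "__main__":
    target = 30  # seconds of finished video
    for cap in (8, 10, 30):  # 8-10 s: Veo 3 per the table; 30 s: rumored
        print(f"cap {cap:>2} s -> {clips_needed(target, cap)} clip(s)")
```

A 30-second sequence needs four clips at an 8-second cap and three at a 10-second cap, but a single generation if the rumored 30-second option proves real.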
Below is a high‑level comparison of the three leading AI video models, based on available sources.
| Model | Core strengths (according to public sources) | Resolution & duration | Audio support | Control & creative features |
|---|---|---|---|---|
| Veo 3 / rumored 3.1 | Built‑in audio generation (sound effects, ambient noise, dialogue) and strong physics & realism. Rumored 3.1 would extend clip length, provide 1080p/vertical outputs and add product/banana placement features. | Veo 3 generates up to about 10 s; API now supports 1080p in 16:9 and vertical 9:16 formats. Rumored 3.1 may offer 30 s clips and uniform 1080p. | Native audio generation (dialogue and soundscapes). | Prompt adherence and physics; reference images for style guidance; rumored product placement and multi‑prompt editing tools. |
| Sora 2 (OpenAI) | More physically accurate and realistic than prior models; can handle complex physics and follow multi‑shot instructions. Allows insertion of real people into scenes. | Generates high‑quality videos; clips of up to about a minute in the Sora app (according to Sora marketing). | Synchronized dialogue and sound effects; can create background soundscapes and speech. | High controllability; can persist world state across shots; includes “cameo” feature to insert user likeness into videos. Access currently via Sora app and limited invitations. |
| Seedance 1.0 Pro (ByteDance) | Focuses on smooth, stable motion and native multi‑shot storytelling; ideal for cinematic or social media animations. Offers diverse stylistic outputs. | Supports multiple resolutions and aspect ratios; videos are typically short (often ~10 s). | Does not currently generate integrated audio. | Multi‑shot storytelling and style reference; designed for animators and content creators seeking fluid animation. |
Google has not released an official timeline for Veo 3.1. Observers note that the Gemini 3 and Veo 3.1 rumors coincide with Google’s October 9, 2025 event, suggesting a potential announcement. In the absence of official confirmation, it is safest to assume that Veo 3.1 will roll out first through Google’s preview channels (Gemini API, AI Studio and Flow), as Veo 3 did, with general availability following later.
The DeepMind site lists several portals for trying Veo. Users can access Veo via Flow, the Gemini app, Google AI Studio, the Gemini API, Google Vids and DeeVid AI. These channels currently offer Veo 3 and Veo 3 Fast; a future Veo 3.1 preview would likely appear in the same interfaces. Early testers should watch for announcements on the Google AI Developers X account.
Q: Is Veo 3.1 officially released?
A: As of October 8, 2025, there is no official release. Google may announce it at its October 9, 2025 event, but that has not been confirmed.
Q: Will Veo 3.1 support longer videos?
A: Multiple leaks indicate a 30‑second option, but this has not been confirmed. Current Veo 3 clips are limited to around 10 seconds via the API.
Q: Does Veo 3.1 generate audio?
A: Veo 3 already generates sound effects, ambient noise and dialogue. Rumored updates will likely retain or improve this capability.
Q: How does Veo compare with Sora 2 and Seedance?
A: Veo focuses on realistic physics and prompt adherence with integrated audio; Sora 2 emphasizes high‑fidelity physical simulation and controllability; Seedance excels at smooth motion and multi‑shot storytelling but lacks audio.
Q: Where can I follow news on Veo 3.1?
A: Monitor the Google AI Developers blog and social media channels, the DeepMind Veo product page, and creative platforms like DeeVid AI. These sources often post updates and will host any official preview of Veo 3.1.