
Veo 3.1 Preview: What More Could We Expect in the Rapid Evolution of AI Video
Intro: Sora 2 and Veo 3
Generative AI is moving quickly into video creation. In September 2025 OpenAI announced Sora 2 — a major leap in text‑to‑video models. According to OpenAI’s research blog, Sora 2 is more physically accurate and realistic than earlier systems, offers synchronized dialogue and sound effects, and can follow intricate multi‑shot instructions while preserving world state. This upgrade positions Sora 2 as a new benchmark in AI video generation.
Google’s Veo 3 has been the primary competitor. A DeepMind overview states that Veo 3 brings built‑in audio generation, enabling sound effects, ambient noise and dialogue, while delivering best‑in‑class quality, realism and prompt adherence. In September 2025 Google updated Veo 3 and the cheaper Veo 3 Fast with the ability to generate 1080p videos and vertical 9:16 format, while slashing the price from $0.75/second to $0.40/second and from $0.40/second to $0.15/second respectively. These rapid improvements fuel speculation about the next incremental release — Veo 3.1.
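To put those per-second prices in concrete terms, here is a minimal sketch (in Python) that estimates per-clip cost from the updated rates quoted above; the model names used as dictionary keys are informal labels, not official API identifiers.

```python
# Per-second pricing after the September 2025 update (USD), per the figures above.
PRICE_PER_SECOND = {
    "veo-3": 0.40,       # was $0.75/second
    "veo-3-fast": 0.15,  # was $0.40/second
}

def clip_cost(model: str, seconds: float) -> float:
    """Estimate the cost of generating a clip of the given length."""
    return round(PRICE_PER_SECOND[model] * seconds, 2)

# An 8-second clip on each tier:
print(clip_cost("veo-3", 8))       # 3.2
print(clip_cost("veo-3-fast", 8))  # 1.2
```

At these rates, a rumored 30-second Veo 3 clip would run about $12.00, which illustrates why the cheaper Fast tier matters for iteration.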
What is Veo 3.1?
Google has not officially announced Veo 3.1, but evidence of a “Veo 3.1 waitlist” has surfaced. Veo 3.1 is expected to be an incremental update to Veo 3/Fast that extends video length and improves control. Leaked menus reportedly show options for 480p, 720p and 1080p output as well as longer 30‑second clips and features like product placement and banana placement (a whimsical test for accurate object positioning).
Google Veo model series
Google’s Veo series demonstrates rapid iteration in video generative models:
- Veo 1 (early 2024) – the initial system produced silent clips with moderate resolution. It established the basic architecture for text‑to‑video generation.
- Veo 2 (mid‑2024) – introduced creative controls. DeepMind’s model page notes that Veo 2 gained “reference powered video” and “match your style” features, allowing users to provide reference images to guide video generation and style matching. These controls improved consistency and creative intent.
- Veo 3 (May 2025) – re‑designed for greater realism. The DeepMind page describes Veo 3 as adding native audio generation (sound effects, ambient noise and dialogue) and improving prompt adherence. The update also claims 4K output and better physics simulation, though the Gemini API initially restricted outputs to 720p.
- Veo 3 Fast (July 2025) – an optimized version offering lower cost and slightly reduced quality. Both Veo 3 and Veo 3 Fast were updated in September 2025 to support 1080p video and a vertical 9:16 aspect ratio.
- Veo 3.1 (unannounced) – expected to build on these improvements with longer video durations, higher resolution via the API, and finer control over scenes and objects.
Main advances & differences of Veo 3.1
The following are expectations drawn from leak analyses and community speculation:
| Expected advance | Evidence/rumor | Why it matters |
|---|---|---|
| Longer clips (up to 30 s) | Rumored waitlist screens show 30 s duration settings, extending the current 8–10 s cap in Veo 3. | Longer duration allows narrative sequences and multi‑shot stories without stitching clips. |
| Higher resolution & improved aspect ratios | Leaked settings mention 480p, 720p and 1080p outputs by default; this would align with the September 2025 update that added 1080p to Veo 3/Fast. | Native 1080p would make Veo competitive with Sora 2, which already produces high‑fidelity videos. |
| Enhanced physics & object interactions | Veo 3 already excels at physics and prompt adherence; leaks suggest 3.1 will further refine motion realism, enabling subtle interactions like product placement or balancing objects (the “banana placement” challenge). | Stronger physics simulation would close the gap with Sora 2’s accurate modeling of buoyancy and inertia. |
| Product placement & character consistency tools | Higgsfield demos reportedly show UI options for inserting branded objects and maintaining consistent characters across shots. | Allows marketers and storytellers to embed products or maintain consistent actors throughout a video. |
| Better multi‑prompt and editing controls | Speculation includes multi‑prompt support and editing tools akin to Google’s AI Studio MediaSim app (demoed for Veo 3). | Improves creative control, enabling multi‑scene composition, transitions and style changes within one generation. |
How does Veo 3.1 perform? Potential improvements in tests
- Realism and audio – Veo 3 already generates audio and realistically simulates physics. If 3.1 adds longer clips, we can expect greater temporal coherence and more varied soundscapes.
- Prompt adherence – Google claims Veo 3 follows prompts “like never before”; 3.1 could improve multi‑shot prompt following, aligning with Sora 2’s ability to persist world state across complex instructions.
- Resolution – September 2025 updates allow 1080p output via the API, so 3.1 may enable this resolution for all aspect ratios and possibly 4K for premium tiers. The Mindstream newsletter noted that 1080p is currently only available for 16:9 clips, while vertical videos remain lower resolution; a 3.1 update could unify these.
- Durations – If 3.1 supports 30‑s videos, we should see improved motion continuity and narrative flow. However, longer sequences could also amplify artifacts if the model fails to maintain consistency.
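The resolution-by-aspect-ratio constraint described above can be captured in a small helper. This is a sketch of the reported behavior only, not an official API: the rules encoded here are assumptions drawn from public reporting (1080p limited to 16:9, vertical capped at 720p).

```python
# Reported Veo 3 output constraints: 1080p is currently available only
# for 16:9 clips, while 9:16 vertical output remains lower resolution.
# These mappings are assumptions based on public reporting, not an API spec.
SUPPORTED = {
    "16:9": {"720p", "1080p"},
    "9:16": {"720p"},
}

def is_supported(aspect_ratio: str, resolution: str) -> bool:
    """Check whether a resolution is reportedly available for an aspect ratio."""
    return resolution in SUPPORTED.get(aspect_ratio, set())

print(is_supported("16:9", "1080p"))  # True
print(is_supported("9:16", "1080p"))  # False: vertical clips stay lower-res
```

If a 3.1 update unifies resolutions across aspect ratios as speculated, the vertical entry would simply gain `"1080p"`.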
Veo 3.1 vs Sora 2 vs Seedance 1.0 Pro
Below is a high‑level comparison of the three leading AI video models, based on available sources.
| Model | Core strengths (according to public sources) | Resolution & duration | Audio support | Control & creative features |
|---|---|---|---|---|
| Veo 3 / rumored 3.1 | Built‑in audio generation (sound effects, ambient noise, dialogue) and strong physics & realism. Rumored 3.1 would extend clip length, provide 1080p/vertical outputs and add product/banana placement features. | Veo 3 generates up to about 10 s; the API now supports 1080p in 16:9 and vertical 9:16 formats. Rumored 3.1 may offer 30‑s clips and uniform 1080p. | Native audio generation (dialogue and soundscapes). | Prompt adherence and physics; reference images for style guidance; rumored product placement and multi‑prompt editing tools. |
| Sora 2 (OpenAI) | More physically accurate and realistic than prior models; can handle complex physics and follow multi‑shot instructions. Allows insertion of real people into scenes. | Generates high‑quality videos; durations up to a minute in the Sora app (according to Sora marketing). | Synchronized dialogue and sound effects; can create background soundscapes and speech. | High controllability; can persist world state across shots; includes a “cameo” feature to insert user likeness into videos. Access currently via the Sora app and limited invitations. |
| Seedance 1.0 Pro (ByteDance) | Focuses on smooth, stable motion and native multi‑shot storytelling; ideal for cinematic or social media animations. Offers diverse stylistic outputs. | Supports multiple resolutions and aspect ratios; videos are typically short (often ~10 s). | Does not generate audio. | Multi‑shot storytelling and style reference; designed for animators and content creators seeking fluid animation. |
When will Veo 3.1 and the Veo 3.1 API be online?
Google has not released an official timeline for Veo 3.1. Observers note that the Gemini 3 and Veo 3.1 rumors coincide with Google’s October 9 2025 event, suggesting a potential announcement. In the absence of official confirmation, it is safest to assume that Veo 3.1 will roll out initially through Google’s preview channels — Gemini API, AI Studio and Flow — as Veo 3 did, followed later by general availability.
Where can we try Veo 3 (and likely Veo 3.1) first?
The DeepMind site lists several portals for trying Veo. Users can access Veo via Flow, the Gemini app, Google AI Studio, the Gemini API, Google Vids and DeeVid AI. These channels currently offer Veo 3 and Veo 3 Fast; a future Veo 3.1 preview would likely appear in the same interfaces. Early testers should watch for announcements on the Google AI Developers X account.
FAQ
Q: Is Veo 3.1 officially released?
A: As of October 8 2025 there is no official release. Google may announce it at its October 9 2025 event.
Q: Will Veo 3.1 support longer videos?
A: Multiple leaks indicate a 30‑second option, but this has not been confirmed. Current Veo 3 clips are limited to around 10 seconds via the API.
Q: Does Veo 3.1 generate audio?
A: Veo 3 already generates sound effects, ambient noise and dialogue. Rumored updates will likely retain or improve this capability.
Q: How does Veo compare with Sora 2 and Seedance?
A: Veo focuses on realistic physics and prompt adherence with integrated audio; Sora 2 emphasizes high‑fidelity physical simulation and controllability; Seedance excels at smooth motion and multi‑shot storytelling but lacks audio.
Q: Where can I follow news on Veo 3.1?
A: Monitor the Google AI Developers blog and social media channels, the DeepMind Veo product page, and creative platforms like DeeVid AI. These sources often post updates and will host any official preview of Veo 3.1.