Google Unveils Veo 3.1 to Rival OpenAI’s Sora 2 with Realistic AI Videos and Sound
The AI race is reaching a new peak. With every announcement a new model emerges: more daring, more immersive, more… expensive. Google did not want to remain a spectator in this contest. With Veo 3.1, it unveils a video model equipped with sound, dialogue, and new editing capabilities. Faced with the viral popularity of Sora 2, the Mountain View firm is playing a different card: narrative precision and creative control.
In brief
- Veo 3.1 integrates audio, dialogue, and sound effects to enrich AI-generated scenes.
- The tool targets serious creators, with editing options and professional formats.
- Three key modules: image composition, creative transitions, and smooth clip extension.
- Google’s AI favors visual coherence, sometimes at the expense of action speed.
Technological Duel: Google takes on the leaders of AI video
When OpenAI, valued at $500 billion without an IPO, launched Sora 2 on September 30, the success was immediate. The app was downloaded more than one million times in just five days and climbed to the top of the App Store. Its approach? A “TikTok-ized” interface designed for sharing and remixing.
Google chose a different path. With Veo 3.1, the goal is clear: to address creators, not influencers. The model generates 1080p videos in horizontal or vertical format, with ambient sound, synchronized voices, and realistic effects. Available through Flow, Vertex AI, and the Gemini API, it comes in two pricing tiers: a fast version at $0.15 per second and a standard version at $0.40 per second.
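To put the per-second pricing in perspective, here is a minimal sketch of the arithmetic. The two rates come from the figures quoted above; the tier labels "fast" and "standard", the function name, and the 8-second clip length are illustrative assumptions, not official identifiers or limits.

```python
from decimal import Decimal

# Per-second rates as quoted in the article.
# The dictionary keys are illustrative labels, not official plan names.
PRICE_PER_SECOND = {
    "fast": Decimal("0.15"),
    "standard": Decimal("0.40"),
}


def clip_cost(seconds: int, tier: str) -> Decimal:
    """Return the estimated cost in USD of generating a clip of the given length."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return PRICE_PER_SECOND[tier] * seconds


# Hypothetical 8-second clip on each tier.
print(clip_cost(8, "fast"))      # 1.20
print(clip_cost(8, "standard"))  # 3.20
```

Under these rates, the standard tier costs a little under three times as much as the fast tier for the same clip length. Actual billing may differ from this simple multiplication.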
The firm emphasizes the audio capabilities, now present in all modules. It promises unprecedented rendering, claiming that Veo 3.1’s lip synchronization surpasses that of all other models.
Where Sora favors visual dynamism, Veo chooses coherence. Movements are slower, but elements remain stable. That is the price of precision. This positioning contrasts with the ambitions of Meta and Luma Labs, which focus more on speed and the wow effect.
Stories that speak: Google’s AI wants to be a storyteller
One of Veo 3.1’s major bets is narrative immersion. The addition of sound allows Google to take a step forward: no longer just illustrating, but telling with images and voices. Three features stand out:
- Ingredients to Video: you combine several reference images, and the AI generates a scene with objects and characters;
- Frames to Video: you provide a starting image and an ending one, and the AI produces a coherent transition;
- Extend: the AI extends a clip by generating the continuation from the last second.
The tool also allows adding or removing elements, taking shadows and lights into account. This level of detail is the strength of the approach: a film studio within an artificial intelligence interface.
But not everything is perfect. When instructions stray too far from visual logic, the AI goes off track. Some scenes jump from one shot to another, lose characters or completely change atmosphere. It remains a technology under development.
As Google explained in its official blog:
We’re also introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures.
Veo 3.1 does not want to entertain: it wants to move. And this is probably where it differs radically from its competitors.
Demanding UX, stunning result: when artificial intelligence becomes a creative tool
The user experience provided by Veo 3.1 is not that of a social network. It is not a product to consume, but a tool to master. Creators must learn to speak the language of AI. A poorly written prompt or one too far from reference images can produce an incoherent result.
Some tips are already circulating among users. One is to use Seedream to generate a faithful initial image before importing it into Veo. Another is to write audio-aware prompts that explicitly mention the desired sounds.
Here are some key figures:
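The audio-aware tip can be sketched as a small helper that assembles a prompt with the sounds spelled out. This is only an illustration: the function name and the "says:", "SFX:", and "Ambient sound:" labels are assumptions chosen for readability, not a syntax documented by Google, and the helper calls no API.

```python
from typing import Optional, Sequence, Tuple


def build_audio_aware_prompt(
    scene: str,
    dialogue: Sequence[Tuple[str, str]] = (),
    sound_effects: Sequence[str] = (),
    ambience: Optional[str] = None,
) -> str:
    """Assemble one prompt string that explicitly names the desired audio."""
    parts = [scene.strip()]
    # Each dialogue entry is a (speaker, line) pair.
    for speaker, line in dialogue:
        parts.append(f'{speaker} says: "{line}"')
    if sound_effects:
        parts.append("SFX: " + ", ".join(sound_effects) + ".")
    if ambience:
        parts.append(f"Ambient sound: {ambience}.")
    return " ".join(parts)


prompt = build_audio_aware_prompt(
    scene="A fisherman repairs a net on a foggy pier at dawn.",
    dialogue=[("The fisherman", "Storm's coming in early this year.")],
    sound_effects=["rope creaking", "a distant foghorn"],
    ambience="gentle waves and gulls",
)
print(prompt)
```

Keeping the visual description, spoken lines, and sound cues as separate arguments makes it easier to vary one of them while holding the others fixed when iterating on a result.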
- Veo has generated more than 275 million videos since the launch of Flow;
- Three creative modules are available: Ingredients, Frames, Extend;
- The usage cost can be as little as half that of Sora 2 Pro;
- Videos can last up to one minute, with integrated sound;
- Only three models handle spoken voices: Sora, Grok, and now Veo.
The tool is not easily tamed. But once understood, it delivers videos of rare realism, with accurate intonations and credible characters. It just requires patience, skill… and some credits.
Google no longer hides its ambition to dominate generative AI. Veo 3.1 shows that the firm does not just want to follow; it wants to set the tempo. As if to confirm this drive, one of its robots has just solved a math problem considered impossible. The message is clear: the AI giant is only beginning to speak.
Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.
