Google AI: Release Notes

How a Moonshot Led to Google DeepMind's Veo 3

Oct 16, 2025
Dumi Erhan, co-lead of the Veo project at Google DeepMind, traces the project from its moonshot beginnings to Veo 3, a model with audio capabilities. He discusses the challenge of keeping long-duration video coherent, how user feedback shapes future development, the complexity of image-to-video generation, and prompting methods that give users more control.
AI Snips
ANECDOTE

Origins In A 2018 Moonshot

  • The Veo project began as a Google Brain moonshot in 2018 aimed at pushing video generation boundaries.
  • Early work focused on video prediction and robotics use cases, which shaped the project's long-term research directions.
INSIGHT

Evaluation And Inductive-Bias Gaps

  • Despite huge quality gains since 2018, core problems like evaluation and correct inductive biases remain unsolved.
  • Video lacks the clear tokenization of text, making progress and measurement harder than with LLMs.
ADVICE

Combine Metrics With Careful Human Eval

  • Use automated metrics to rule out failing models, but rely on human evaluation for preferences and user-facing quality.
  • Avoid optimizing solely for superficial human-preference signals like contrast that don't improve real capability.
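
The two-stage evaluation described above can be sketched in a few lines of Python. This is an illustration only, not the Veo team's pipeline: the `Candidate` fields, the `min_auto_score` threshold, and all numbers are hypothetical. The automated metric is used solely as a gate, and the surviving models are then ordered by human win rate.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All fields are hypothetical, chosen only for illustration.
    name: str
    auto_score: float   # some automated metric, higher is better
    human_wins: int     # side-by-side comparisons won against a baseline
    human_trials: int   # total side-by-side comparisons rated


def shortlist(candidates, min_auto_score):
    """Stage 1: the automated metric only rules out clearly failing models."""
    return [c for c in candidates if c.auto_score >= min_auto_score]


def rank_by_human_preference(candidates):
    """Stage 2: order the survivors by human win rate, not by the automated score."""
    rated = [c for c in candidates if c.human_trials > 0]
    return sorted(rated, key=lambda c: c.human_wins / c.human_trials, reverse=True)


# Made-up numbers: model_c is liked by raters but fails the automated gate.
candidates = [
    Candidate("model_a", auto_score=0.82, human_wins=41, human_trials=100),
    Candidate("model_b", auto_score=0.74, human_wins=63, human_trials=100),
    Candidate("model_c", auto_score=0.31, human_wins=70, human_trials=100),
]

survivors = shortlist(candidates, min_auto_score=0.5)
ranking = rank_by_human_preference(survivors)
print([c.name for c in ranking])
```

In this toy data, `model_c` has the highest human win rate but is excluded by the automated gate, which is one way to keep a superficially appealing but failing model from winning on preference alone.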