Google DeepMind: The Podcast

Genie 3: An infinite world model with Shlomi Fruchter and Jack Parker-Holder

Aug 22, 2025
Jack Parker-Holder, a Research Scientist at Google DeepMind, and Shlomi Fruchter, a Research Director, introduce Genie 3, a world model that creates interactive environments from text prompts. They explain how it goes beyond conventional video generation by producing dynamic simulations that users can act in. Topics include the implications for education and training, the evolution of AI models, and how well the simulations capture physics. They also discuss the potential of AI to power immersive experiences.
INSIGHT

Real-Time Autoregressive World Model

  • Genie 3 is an autoregressive world model that generates each pixel in real time from past frames and user inputs.
  • This lets it produce flexible, explorable worlds without a traditional game engine.
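
The autoregressive loop described above can be sketched in a few lines. This is a toy illustration only, not Genie 3's implementation: the "frame" is reduced to an agent position, and `toy_next_frame`, `MOVES`, and the `context` window size are hypothetical stand-ins for a learned model and its inputs.

```python
from collections import deque
from typing import Deque, List, Tuple

Frame = Tuple[int, int]  # toy "frame": just the agent's (x, y) position

# Hypothetical action set, for illustration only.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1), "stay": (0, 0)}


def toy_next_frame(history: Deque[Frame], action: str) -> Frame:
    """Stand-in for a learned model: next frame from past frames plus an action."""
    x, y = history[-1]
    dx, dy = MOVES[action]
    return (x + dx, y + dy)


def rollout(first_frame: Frame, actions: List[str], context: int = 8) -> List[Frame]:
    """Generate frames one at a time, feeding each new frame back in as context."""
    history: Deque[Frame] = deque([first_frame], maxlen=context)
    frames = [first_frame]
    for action in actions:
        frame = toy_next_frame(history, action)  # depends on the past and the input
        history.append(frame)                    # the output becomes future context
        frames.append(frame)
    return frames


print(rollout((0, 0), ["right", "right", "down"]))
# [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The point of the sketch is the feedback loop: each output is appended to the history that conditions the next step, so no game engine or precomputed scene is needed to decide what comes next.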
ANECDOTE

Controlling A Cat Through A Generated Apartment

  • Shlomi demoed controlling a ginger cat in a furnished apartment using keyboard inputs to move and look around.
  • The model updated lighting and scene details in real time as the cat moved toward sunlight.
INSIGHT

Video vs. Interactive Generation

  • Video models like Veo output a fixed video and cannot be freely explored or changed after generation.
  • Genie 3 instead generates frames autoregressively, so the future isn't predetermined and the world remains interactive.
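
The contrast between a fixed video and an interactive rollout can be shown with a toy example. Neither function reflects how Veo or Genie 3 works internally; `fixed_video` and `interactive_rollout` are hypothetical names, and frames are plain strings purely for illustration.

```python
from typing import Callable, List


def fixed_video(prompt: str, length: int) -> List[str]:
    """Toy video generator: every frame is decided up front from the prompt."""
    return [f"{prompt}:frame{t}" for t in range(length)]


def interactive_rollout(prompt: str, get_action: Callable[[int], str], length: int) -> List[str]:
    """Toy interactive generator: each frame depends on the action chosen at that step."""
    frames = [f"{prompt}:start"]
    for t in range(length):
        action = get_action(t)                   # user input arrives during generation
        frames.append(f"{frames[-1]}>{action}")  # next frame built from the previous one
    return frames


video = fixed_video("apartment", 3)
walk_left = interactive_rollout("apartment", lambda t: "left", 2)
walk_right = interactive_rollout("apartment", lambda t: "right", 2)

print(video)       # same output every time for this prompt
print(walk_left)   # diverges from walk_right after the shared first frame
print(walk_right)
```

In the fixed case there is no point at which a user could change the outcome. In the interactive case the two rollouts share a starting frame and then diverge because the inputs differ.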