The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Genie 3: A New Frontier for World Models with Jack Parker-Holder and Shlomi Fruchter - #743

Aug 19, 2025
Jack Parker-Holder and Shlomi Fruchter, both researchers at Google DeepMind, discuss Genie 3, a model that generates playable virtual worlds. They trace the evolution of world models in AI and explain why such models matter for decision-making and planning. They also cover Genie 3’s real-time interactivity, its visual memory, and the challenges faced in its development. Finally, they touch on promptability, the model's ability to dynamically manipulate virtual environments, and the applications it could enable.
INSIGHT

World Models As General Simulators

  • A world model predicts future states given past observations and actions, enabling planning and counterfactual reasoning.
  • Jack Parker-Holder reframes world models as general simulators of dynamics, not just single-MDP predictors.
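
The definition above can be made concrete with a toy example. The sketch below is not Genie's architecture; it assumes a hand-written 5x5 grid transition function (`toy_world_model`, a hypothetical stand-in for a learned model) and shows how a next-state predictor supports both counterfactual rollouts and simple planning by imagined search. All names here are illustrative assumptions.

```python
from itertools import product

# Hypothetical action set for a 5x5 grid world: name -> (dx, dy).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def toy_world_model(state, action):
    """Stand-in for a learned model: predict the next state from state and action."""
    dx, dy = ACTIONS[action]
    x, y = state
    # Clamp to the grid so walls block movement.
    return (min(max(x + dx, 0), 4), min(max(y + dy, 0), 4))


def rollout(model, state, actions):
    """Imagine a trajectory for an action sequence without a real environment."""
    trajectory = [state]
    for action in actions:
        state = model(state, action)
        trajectory.append(state)
    return trajectory


def plan(model, state, goal, horizon=3):
    """Search all action sequences and keep the one whose imagined end state is nearest the goal."""
    def distance(s):
        return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

    return min(
        product(ACTIONS, repeat=horizon),
        key=lambda seq: distance(rollout(model, state, seq)[-1]),
    )


# Counterfactual reasoning: same start state, two different imagined futures.
future_a = rollout(toy_world_model, (0, 0), ["right", "right"])
future_b = rollout(toy_world_model, (0, 0), ["down", "down"])

# Planning: choose actions by simulating outcomes inside the model.
best = plan(toy_world_model, (0, 0), goal=(2, 1), horizon=3)
```

A learned world model would replace `toy_world_model` with a network that predicts future observations, but the surrounding rollout and planning loop keeps the same shape.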
ANECDOTE

Early Proof: Latent Actions Made Playable Worlds

  • Genie 1 proved the foundation world model idea using unlabeled videos and learned latent actions to make worlds playable.
  • Jack Parker-Holder recounts training small models for 2D platformers and robot arms that worked but were limited to short durations and suffered from mode collapse.
INSIGHT

Diffusion And Scale Extended Temporal Coherence

  • In Genie 2, scaling to 3D data and adopting diffusion methods extended generation to 360p and tens of seconds of consistent simulation.
  • Jack Parker-Holder notes Genie 2 showed viability for larger 3D worlds but still lacked real-time interactivity and text-based prompting.