Genie: Generative Interactive Environments with Ashley Edwards - #696
Aug 5, 2024
In this conversation, Ashley Edwards, a member of the technical staff at Runway with past affiliations at Google DeepMind and Uber, discusses the Genie project. They explore Genie's ability to create interactive video environments for training reinforcement learning agents without supervision. Topics include the mechanics of latent action models, video tokenization, and dynamics modeling for frame prediction. Ashley highlights the practical implications of Genie, compares it to other models like Sora, and maps out future directions in video generation.
Genie enables unsupervised learning of action policies and rewards from videos, significantly advancing reinforcement learning capabilities without labeled data.
The model's potential applications extend beyond gaming into educational and creative fields, offering innovative tools for interactive learning and artistic exploration.
Deep dives
Bridging Deployment Challenges with GenAI
Many enterprises find it difficult to transition from GenAI proof of concept to real-world deployment, highlighting the need for effective solutions. Motific, a recent AI offering from Outshift by Cisco, addresses key concerns by reducing the time needed to implement AI applications from months to days. It tackles critical issues like security, compliance, and cost risks that businesses face, paving the way for more efficient deployment of AI projects. By establishing a foundation built on trust and efficiency, Motific aims to empower organizations to confidently launch their GenAI initiatives.
Advancements in Reinforcement Learning with Genie
Recent advancements in reinforcement learning emphasize the desire for generalist agents capable of functioning across a variety of environments. Genie aims to learn from a virtually unlimited source of training environments via video data, offering an interactive learning experience without the need to physically place agents in these settings. This approach facilitates the scaling of training methods for reinforcement learning, allowing for improved adaptability and performance of agents. By training on diverse 2D platformer game videos and robotics datasets, Genie exemplifies a more flexible reinforcement learning framework.
Innovative Learning Through Unsupervised Methods
Genie represents a significant shift by enabling the learning of action policies and rewards from videos in an unsupervised manner. Unlike traditional methods that rely heavily on labeled data and manual demonstrations, Genie leverages extensive video data to extract meaningful actions without explicit supervision. This capability allows the model to recognize and predict interactions within simulated environments efficiently. By applying this unsupervised learning technique, Genie can construct a world model from diverse data, enhancing the versatility and depth of reinforcement learning applications.
Exploration Beyond Gaming and Future Applications
The implications of Genie extend beyond gaming, presenting opportunities for educational and creative applications in various fields. Teachers have shown interest in using the model as an interactive classroom tool, letting students engage with AI-generated environments. Furthermore, its adaptability to varied sources of input, such as sketches or photographs, opens up innovative avenues for artistic and creative exploration. As researchers continue to develop and refine this technology, the potential for interactive media and simulations presents exciting possibilities for the future.
Today, we're joined by Ashley Edwards, a member of technical staff at Runway, to discuss Genie: Generative Interactive Environments, a system for creating 'playable' video environments for training deep reinforcement learning (RL) agents at scale in a completely unsupervised manner. We explore the motivations behind Genie, the challenges of data acquisition for RL, and Genie's ability to learn world models from videos without explicit action data, enabling seamless interaction and frame prediction. Ashley walks us through Genie's core components—the latent action model, video tokenizer, and dynamics model—and explains how these elements work together to predict future frames in video sequences. We discuss the model architecture, training strategies, and benchmarks used, as well as the application of spatiotemporal transformers and the MaskGIT technique for efficient token prediction and representation. Finally, we touch on Genie's practical implications, its comparison to other video generation models like Sora, and potential future directions in video generation and diffusion models.
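To make the three-component pipeline described above concrete, here is a heavily simplified toy sketch of how a video tokenizer, latent action model, and dynamics model might fit together. All function names and the toy "token shift" logic are illustrative assumptions for exposition only, not the actual Genie architecture (which uses learned VQ tokenizers, spatiotemporal transformers, and MaskGIT-style decoding):

```python
# Illustrative sketch only: toy stand-ins for Genie's three components.
# Real Genie uses learned neural models; here each stage is a tiny
# deterministic function so the data flow is easy to follow.

def tokenize(frame):
    """Toy 'video tokenizer': map each pixel value to a discrete token id."""
    return [int(p) % 16 for p in frame]

def latent_action(prev_tokens, next_tokens):
    """Toy 'latent action model': infer an unsupervised action label as the
    dominant token shift between two consecutive frames (no action labels
    are ever provided, mirroring Genie's unsupervised setup)."""
    shifts = [(b - a) % 16 for a, b in zip(prev_tokens, next_tokens)]
    return max(set(shifts), key=shifts.count)

def dynamics(tokens, action):
    """Toy 'dynamics model': predict the next frame's tokens from the
    current tokens plus the inferred latent action."""
    return [(t + action) % 16 for t in tokens]

# Two consecutive "frames" of raw pixel values.
frame_t, frame_t1 = [1, 2, 3, 4], [3, 4, 5, 6]

# Infer the latent action from the pair, then roll the dynamics forward.
a = latent_action(tokenize(frame_t), tokenize(frame_t1))
predicted = dynamics(tokenize(frame_t), a)
assert predicted == tokenize(frame_t1)  # the toy shift is fully recoverable
```

The point of the sketch is the interface, not the math: actions are never observed, only inferred from frame pairs, and the dynamics model consumes (tokens, latent action) to predict the next frame's tokens — the same contract the episode describes for Genie's real components.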