Generative Video WorldSim, Diffusion, Vision, Reinforcement Learning and Robotics — ICML 2024 Part 1

Latent Space: The AI Engineer Podcast

Innovations in Video Generation: The VideoPoet Model

This chapter covers advances in video generation, focusing on VideoPoet, a model that handles video, audio, and text through a unified token vocabulary. It discusses the strengths of the autoregressive language-model approach, critiques traditional diffusion models, and highlights VideoPoet's tokenizer and multimodal capabilities. It also addresses the challenges of training such models and the research still needed to improve their performance and versatility.
