Generative Video WorldSim, Diffusion, Vision, Reinforcement Learning and Robotics — ICML 2024 Part 1

Latent Space: The AI Engineer Podcast

CHAPTER

Innovations in Video Generation: The VideoPoet Model

This chapter covers advances in video generation, focusing on VideoPoet, a model that integrates video, audio, and text through a unified vocabulary. It discusses the strengths of the autoregressive language model approach, critiques traditional diffusion models, and highlights VideoPoet's tokenizer and multimodal capabilities. It also addresses the challenges of training such models and the research still needed to improve their performance and versatility.
