Generative Video WorldSim, Diffusion, Vision, Reinforcement Learning and Robotics — ICML 2024 Part 1

Latent Space: The AI Engineer Podcast

Advancements in Generative Models and 3D Reconstruction

This chapter explores classifier guidance and recent advances in generative models, focusing on text-to-image and text-to-video systems such as Imagen 3 and Veo. It highlights the difficulty of evaluating model performance and describes methods for inferring 3D structure from 2D data, with the aim of making 3D reconstruction more accurate and efficient. The discussion also covers current limitations in dynamic scene modeling and approaches, including CAT3D, for improving 3D model generation.
