AI + a16z

Beyond Language: Inside a Hundred-Trillion-Token Video Model

Jul 3, 2024
Luma Chief Scientist Jiaming Song discusses Dream Machine, a video model trained on vast amounts of video data that shows emergent reasoning abilities. He explains how the 'bitter lesson' applies to generative models: the shift toward spending more compute on simpler methods. The episode also covers the evolution of GANs, the limits of scaling language models, advances in fine-tuning 2D models for 3D representations, and how Dream Machine could change graphics.
01:05:14

Podcast summary created with Snipd AI

Quick takeaways

  • Tokenizing video poses challenges that tokenizing language does not, and diverse datasets call for new approaches.
  • Dream Machine's efficient architecture enables faster training with long sequences, showcasing the importance of design for large-scale models.

Deep dives

Model Training Data Comparison: Llama 3 vs. Dream Machine

Llama 3, described as the world's largest open-source model, was trained on 15 trillion tokens, whereas Dream Machine v0, the smallest model, was trained on hundreds of trillions of tokens.
