High Agency: The Podcast for AI Builders

The End of Language-Only Models | Amit Jain, Luma AI

May 13, 2025
Amit Jain, CEO and co-founder of Luma AI and former Apple Vision Pro engineer, discusses the future of AI beyond just language models. He emphasizes the importance of multimodal training, particularly the often-overlooked role of video in AI development. Amit shares insights on how combining audio, video, and text can revolutionize industries like entertainment and advertising. He also touches on the potential for fully AI-generated feature films and critiques trend-driven approaches in AI, advocating for more meaningful innovations.
40:17

Podcast summary created with Snipd AI

Quick takeaways

  • Luma AI emphasizes the need for multimodal training that integrates diverse data types for enhanced AI capabilities beyond just language.
  • The podcast highlights Luma's innovative approach with Inductive Moment Matching, enabling models to effectively process and understand various modalities simultaneously.

Deep dives

Multimodal General Intelligence Development

Luma AI aims to build multimodal general intelligence by jointly training models on diverse data types such as audio, video, language, and text. This joint approach is intended to let the models learn in a way that mirrors human cognitive processes, improving their ability to interpret and generate content across different media. Combining these modalities from the start, rather than pre-training on language alone, is presented as a revolutionary shift in AI model development. By grounding its models in a broader dataset that encompasses the entirety of humanity's digital interactions, Luma believes it can create more effective AI systems for real-world applications.
