This week, Raza is joined by Amit Jain, CEO and co-founder of Luma AI, to explore why the future of artificial intelligence lies beyond language. Amit shares Luma’s bold mission to build world models through multimodal training and explains why video is the most overlooked and most critical data source in AI today.
Chapters:
00:00 - Introduction
03:40 - Competing with Big AI Labs: Language vs. Multimodality
08:09 - Joint Training and Why Current Multimodal Models Fall Short
11:01 - Language is Discrete, the World is Continuous
14:36 - Do These Models Have World Models?
18:18 - Planning, Counterfactuals, and Causal Reasoning in AI
22:08 - Capabilities of Ray 2 and Real-World Use Cases
26:14 - Rethinking Video Length and Creative Workflows
29:18 - Solving Coherence Across Shots and Characters
30:00 - When Will AI Create a Feature-Length Film?
31:27 - What You Can Build with Luma’s API Today
35:49 - Overlooked Ideas and Noise in the AI Industry
38:34 - Why Video is the Missing Link in AI