Chris Lattner of Modular (https://modular.com) joined us (again!) to talk about how they are breaking the CUDA monopoly, what it took to match NVIDIA performance with AMD, and how they are building a company of "elite nerds".
X: https://x.com/latentspacepod
Substack: https://latent.space
00:00:00 Introductions
00:00:12 Overview of Modular and the Shape of Compute
00:02:27 Modular’s R&D Phase
00:06:55 From CPU Optimization to GPU Support
00:11:14 MAX: Modular’s Inference Framework
00:12:52 Mojo Programming Language
00:18:25 MAX Architecture: From Mojo to Cluster-Scale Inference
00:29:16 Open Source Contributions and Community Involvement
00:32:25 Modular’s Differentiation from vLLM and SGLang
00:41:37 Modular’s Business Model and Monetization Strategy
00:53:17 DeepSeek’s Impact and Low-Level GPU Programming
01:00:00 Inference-Time Compute and Reasoning Models
01:02:31 Personal Reflections on Leading Modular
01:08:27 Daily Routine and Time Management as a Founder
01:13:24 Using AI Coding Tools and Staying Current with Research
01:14:47 Personal Projects and Work-Life Balance
01:17:05 Hiring, Open Source, and Community Engagement