

Latent Space: The AI Engineer Podcast
swyx + Alessio
The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you everything from the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space
Episodes

103 snips
Aug 22, 2023 • 59min
Cursor.so: The AI-first Code Editor — with Aman Sanger of Anysphere
Aman Sanger, founder of Abelian AI and Cursor.so, has a rich background in AI and finance, with experience at Google and McKinsey. He discusses the innovative AI-powered code editor, Cursor, which is transforming coding practices. Sanger emphasizes the need for new IDEs to push AI coding efficiency beyond current limits. He delves into the challenges of integrating AI with CAD applications and shares insights on advanced coding techniques using AI models. Throughout, he highlights the evolving landscape of AI in coding and the potential for future advancements.

33 snips
Aug 16, 2023 • 51min
The Mathematics of Training LLMs — with Quentin Anthony of EleutherAI
Quentin Anthony, a PhD student at Ohio State University and head engineer at EleutherAI, dives into the intricacies of training large language models. He discusses the importance of community knowledge and practical strategies for GPU optimization. Quentin unpacks the mathematics behind compute requirements and addresses the challenges of floating-point operations. He also explores autoregressive modeling techniques, contrasts them with traditional approaches, and examines the complexities of optimizing training, including the Adam optimizer and distributing models across GPUs.
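For a flavor of the back-of-the-envelope math the episode covers, here is a minimal sketch using the common C ≈ 6·N·D approximation (roughly 6 FLOPs per parameter per training token); the model size, token count, and utilization figures below are hypothetical placeholders, not numbers from the episode.

```python
# Back-of-the-envelope training compute using the common C ~= 6 * N * D
# approximation (about 6 FLOPs per parameter per training token).
# Illustrative numbers only -- not figures from the episode.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * n_params * n_tokens


def gpu_days(total_flops: float, peak_flops_per_gpu: float, mfu: float = 0.4) -> float:
    """Convert total FLOPs into GPU-days at a given model FLOPs utilization (MFU)."""
    seconds = total_flops / (peak_flops_per_gpu * mfu)
    return seconds / 86_400


if __name__ == "__main__":
    # Hypothetical: a 7B-parameter model trained on 1T tokens on A100s (~312 TFLOP/s BF16).
    flops = training_flops(7e9, 1e12)
    print(f"total compute: {flops:.2e} FLOPs")
    print(f"roughly {gpu_days(flops, 312e12):,.0f} A100-days at 40% MFU")
```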

13 snips
Aug 10, 2023 • 52min
LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML
Tianqi Chen, an Assistant Professor at CMU and the innovative mind behind XGBoost and Apache TVM, dives into the world of machine learning compilation. He discusses the urgent GPU shortage and explores how to run large language models locally on consumer hardware rather than data-center GPUs. Highlights include the groundbreaking ability to execute a 70 billion parameter model in web browsers and advancements in AMD card support, making powerful AI accessible for developers. They also tackle the importance of weight quantization and community collaboration in optimizing machine learning tools.
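As a rough illustration of the weight quantization idea mentioned above, here is a minimal group-wise 4-bit quantization sketch in NumPy; it shows the general technique for shrinking weights to fit consumer devices, not MLC/TVM's actual implementation, and the group size is an arbitrary choice.

```python
# Minimal sketch of group-wise 4-bit weight quantization, the general idea
# behind shrinking LLM weights to fit consumer devices. Illustrative only --
# not MLC/TVM's actual implementation.
import numpy as np


def quantize_int4(weights: np.ndarray, group_size: int = 128):
    """Quantize a flat float array to 4-bit integers with one scale per group."""
    w = weights.reshape(-1, group_size)
    # One scale per group, chosen so values map into the int4 range; clamp avoids divide-by-zero.
    scales = np.maximum(np.abs(w).max(axis=1, keepdims=True), 1e-8) / 7.0
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales


def dequantize_int4(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover an approximate float array from the quantized values."""
    return (q.astype(np.float32) * scales).reshape(-1)


if __name__ == "__main__":
    w = np.random.randn(1024).astype(np.float32)
    q, s = quantize_int4(w)
    err = np.abs(w - dequantize_int4(q, s)).mean()
    print(f"mean absolute quantization error: {err:.4f}")
```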

27 snips
Aug 4, 2023 • 59min
[AI Breakdown] Summer AI Technical Roundup: a Latent Space x AI Breakdown crossover pod!
NLW, a prominent Daily AI podcaster and YouTuber, dives into the latest AI advancements, highlighting the launch of OpenAI's Code Interpreter, seen as a leap toward GPT 4.5. He discusses its unexpected utility beyond coding, the intricate challenges in evaluating AI models, and the competitive landscape, especially with open-source tools like Llama 2. The conversation also touches on the potential of AI companions in personal growth and the evolving role of AI engineers, making it a must-listen for anyone interested in the future of technology.

64 snips
Jul 26, 2023 • 55min
FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI
Tri Dao, a recent Stanford PhD grad and Chief Scientist at Together AI, discusses his groundbreaking work on FlashAttention-2, speeding up transformer training and inference without approximation. He explains how FlashAttention improves efficiency by cutting attention's memory footprint from quadratic to linear in sequence length. The conversation also touches on the importance of memory architecture in GPU performance and the balance of traditional techniques with modern AI innovations. Lastly, Tri reflects on the dynamic landscape of AI research and the rise of open-source contributions in the field.
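To make the quadratic-versus-linear point concrete, here is a small sketch comparing the memory needed to materialize the full attention score matrix against storing only the Q/K/V/output activations, as a tiled approach like FlashAttention does; the head count and dimensions are hypothetical, not figures from the episode.

```python
# Rough illustration of attention memory scaling: materializing the full
# (seq_len x seq_len) score matrix grows quadratically with sequence length,
# while a tiled/streaming approach like FlashAttention only keeps activations
# that grow linearly. Head count and dimension are hypothetical placeholders.

def naive_attention_bytes(seq_len: int, n_heads: int = 32, dtype_bytes: int = 2) -> int:
    """Memory to store the full attention score matrix for every head."""
    return n_heads * seq_len * seq_len * dtype_bytes


def tiled_attention_bytes(seq_len: int, n_heads: int = 32, head_dim: int = 128,
                          dtype_bytes: int = 2) -> int:
    """Memory for Q, K, V and output activations only; the score matrix is never stored."""
    return 4 * n_heads * seq_len * head_dim * dtype_bytes


if __name__ == "__main__":
    for n in (2_048, 8_192, 32_768):
        print(f"seq_len={n:>6}: naive ~{naive_attention_bytes(n) / 2**30:6.2f} GiB, "
              f"tiled ~{tiled_attention_bytes(n) / 2**30:5.2f} GiB")
```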

51 snips
Jul 19, 2023 • 1h 20min
Llama 2: The New Open LLM SOTA (ft. Nathan Lambert, Matt Bornstein, Anton Troynikov, Russell Kaplan, Whole Mars Catalog et al.)
In this discussion, guests Nathan Lambert, a machine learning researcher at Hugging Face, and Matt Bornstein from a16z, share insights on the revolutionary Llama 2 model. They explore its technical advancements, including improved context length and its arrival as a strong competitor in the open LLM landscape. Ethical concerns surrounding open-source AI, data sourcing, and user privacy come into play. The conversation highlights the potential for democratizing AI and the importance of having control over sensitive data, pivotal for businesses and organizations.

108 snips
Jul 17, 2023 • 1h 1min
AI Fundamentals: Datasets 101
The discussion kicks off with the crucial role of datasets in AI training, debunking the myth that models like GPT-3 use the entire internet for data. It emphasizes the immense effort required for quality data selection and the evolution of training methods. Key examples like Common Crawl and debates around data quality versus quantity are highlighted. Ethical concerns regarding copyright and licensing for datasets are also explored, while the importance of deduplication and data curation is underscored to enhance model accuracy.
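As a minimal sketch of the deduplication step discussed, here is exact document-level dedup via content hashing; production pipelines typically layer fuzzy matching (e.g. MinHash) on top, and the whitespace/case normalization shown is just an illustrative choice.

```python
# Minimal sketch of exact document deduplication via content hashing, one of
# the curation steps the episode covers. Real pipelines usually add fuzzy
# matching (e.g. MinHash) on top; this shows only the illustrative core.
import hashlib


def dedupe(documents):
    """Keep the first occurrence of each distinct document after whitespace/case normalization."""
    seen = set()
    unique = []
    for doc in documents:
        key = hashlib.sha256(" ".join(doc.lower().split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


if __name__ == "__main__":
    docs = ["The cat sat.", "the  CAT sat.", "A dog barked."]
    print(dedupe(docs))  # -> ['The cat sat.', 'A dog barked.']
```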

120 snips
Jul 10, 2023 • 2h 4min
Code Interpreter == GPT 4.5 (w/ Simon Willison, Alex Volkov, Aravind Srinivas, Alex Graveley, et al.)
In this engaging discussion, experienced developer Simon Willison, AI researcher Alex Volkov, and Perplexity founder Aravind Srinivas explore the groundbreaking capabilities of the new Code Interpreter. They reveal its potential for data analysis, video editing, and refactoring tasks while addressing significant limitations and security concerns. The conversation highlights exciting applications, including sentiment analysis and game development feedback, showcasing how AI tools can optimize coding efficiency and enhance user creativity in programming.

35 snips
Jul 2, 2023 • 1h
[Practical AI] AI Trends: a Latent Space x Practical AI crossover pod!
In this engaging discussion, Dan Whitenack, a data scientist with a PhD in mathematical and computational physics and co-host of Practical AI, dives into the evolution of AI and podcasting. He shares personal anecdotes about both shows' journeys, favorite episodes, and the importance of understanding AI's historical context. The conversation shifts to implementing AI in low-resource settings, the creation of Prediction Guard, and the critical role of user experience in AI application adoption. With insights on the unique challenges faced by both engineers and data scientists, it's a lively exploration of today's AI landscape.

46 snips
Jul 1, 2023 • 2h 5min
[Cognitive Revolution] The Tiny Model Revolution with Ronen Eldan and Yuanzhi Li of Microsoft Research
Join Ronen Eldan and Yuanzhi Li from Microsoft Research as they dive into the fascinating world of tiny language models. Learn how their TinyStories project showcases these models' surprising storytelling abilities while prioritizing data quality over sheer size. The duo discusses new training methods that mimic human language learning and explores the emergence of reasoning skills in AI. Discover the creative challenges of generating diverse narratives for young audiences and how understanding these small models can reshape the future of AI.