

Latent Space: The AI Engineer Podcast
swyx + Alessio
The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers, and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra, and more, directly from the founders, builders, and thinkers pushing the cutting edge. We strive to give you both the definitive take on the Current Thing and the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space
Episodes
313 snips
Jan 17, 2026 • 1h 13min
Brex’s AI Hail Mary — With CTO James Reggio
James Reggio, CTO of Brex and leader of its AI transformation, shares his journey from mobile engineer to fintech innovator. He discusses Brex's three-pillar AI strategy, aimed at enhancing corporate workflows, operational compliance, and customer-facing product features. Reggio explains how SOP-driven agents outperform traditional reinforcement learning in automating processes like KYC and underwriting, and he emphasizes empowering employees to build their own AI tools, as well as the advantages of a multi-agent network architecture for financial operations.

501 snips
Jan 9, 2026 • 1h 18min
Artificial Analysis: The Independent LLM Analysis House — with George Cameron and Micah Hill-Smith
Join George Cameron, co-founder of Artificial Analysis and benchmarking guru, along with Micah Hill-Smith, who crafted its evaluation methodology and unique benchmarks. They share their journey from a basement project to a vital tool for AI model assessment. Discover why independent evaluations matter, how their "mystery shopper" strategy keeps benchmarks honest, and how the Omniscience index prioritizes accurate responses. Learn about the evolving AI landscape and their predictions for the future of benchmarking and transparency.

504 snips
Jan 6, 2026 • 24min
[State of Evals] LMArena's $1.7B Vision — Anastasios Angelopoulos, LMArena
Anastasios Angelopoulos, founder of LMArena, shares his journey from a Berkeley basement to a $100M valuation. He discusses why they chose to spin out as a company to scale their mission. The conversation dives into Arena's innovative approach to benchmarking AI models, the transparency of their public leaderboard, and their responses to critiques. Anastasios also reveals plans for expanding into new verticals like medicine and legal, the significance of community engagement, and the exciting shift to multimodal arenas.

489 snips
Jan 2, 2026 • 28min
[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton
Kevin Wang, an undergraduate researcher at Princeton, and Ishaan Javali, his co-author, discuss their groundbreaking work on scaling reinforcement learning networks to 1,000 layers deep, a feat previously deemed impossible. They dive into the shift from traditional reward maximization to self-supervised learning objectives, highlighting architectural enablers like residual connections. The duo also explores efficiency trade-offs, data collection in JAX, and the implications for robotics, positioning their approach as a radical rethinking of reinforcement learning objectives.

318 snips
Dec 31, 2025 • 18min
[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang
Join John Yang, a Stanford PhD student and the mind behind SWE-bench and CodeClash, as he shares insights from the cutting-edge world of AI coding benchmarks. Discover how SWE-bench went from zero to industry standard in mere months, the limitations of traditional unit tests, and the innovative long-horizon tournaments of CodeClash. Yang dives into the debate around Tau-bench's 'impossible tasks' and explores the balance between autonomous agents and interactive workflows. Get ready for a glimpse into the future of human-AI collaboration!

392 snips
Dec 31, 2025 • 28min
[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI
In this engaging discussion, Josh McGrath, a post-training researcher at OpenAI, dives into the evolution of models from GPT-4.1 to GPT-5.1. He highlights the importance of data quality over the choice of optimization method and explains why RLHF and RLVR are simply variations of policy gradients. Josh also shares how the shopping model enhances user experience with personality toggles, and discusses the complexities of scaling reinforcement learning. His call for more engineers proficient in both distributed systems and ML underscores the need for interdisciplinary expertise in advancing AI.

477 snips
Dec 30, 2025 • 45min
[State of RL/Reasoning] IMO/IOI Gold, OpenAI o3/GPT-5, and Cursor Composer — Ashvin Nair, Cursor
In this discussion, Ashvin Nair, a researcher with a rich background in robotics and AI, shares his journey from OpenAI to Cursor. He describes the transition from slow-moving robotics challenges to the quicker impact of language models. Ashvin delves into the economic dynamics of LLMs, the importance of co-designing models and products, and the complexities of continual learning. He also explores the limits of scaling and the need for specialized models, offering insights into the future of coding automation and the evolving AI landscape.

386 snips
Dec 30, 2025 • 29min
[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify
Join Sarah Catanzaro, a general partner at Amplify Partners with a focus on data and AI infrastructure, as she discusses the evolving landscape of AI startups. She shares insights on the impact of the DBT-Fivetran merger and how data tools are vital for frontier labs. Sarah critiques the trend of massive seed funding without clear roadmaps while highlighting when such raises are warranted. Delve into exciting topics like memory management, personalization challenges in AI products, and the true essence of real-world training environments.

747 snips
Dec 27, 2025 • 1h 39min
One Year of MCP — with David Soria Parra and AAIF leads from OpenAI, Goose, Linux Foundation
David Soria Parra, the lead core maintainer of the Model Context Protocol (MCP) at Anthropic, shares insights from MCP’s rapid ascent in the AI world. Joined by Nick Cooper from OpenAI and Jim Zemlin, CEO of the Linux Foundation, they discuss the journey from Thanksgiving hackathons to widespread enterprise adoption. The trio explores the design challenges of ensuring interoperability between agents, the decision to join the AAIF for neutral governance, and how MCP enhances agent capabilities while maintaining flexibility and security.

1,389 snips
Dec 26, 2025 • 37min
Steve Yegge's Vibe Coding Manifesto: Why Claude Code Isn't It & What Comes After the IDE
Steve Yegge, a veteran software engineer known for his roles at Google and Amazon, dives deep into the future of coding. He argues that IDEs will soon be obsolete, with developers orchestrating AI agents like NASCAR pit crews instead of writing code by hand. Steve warns against anthropomorphizing these agents, noting the risks that poses. He also discusses the growing challenge of merging code on highly productive teams and predicts a world where multi-agent systems revolutionize code creation, likening it to a "factory farming" approach.


