
Latent Space: The AI Engineer Podcast
The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space for news, papers, and interviews in Software 3.0. We cover the Foundation Models changing every domain, from Code Generation and Multimodality to AI Agents and GPU Infra, directly from the founders, builders, and thinkers pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to your first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes are always at https://latent.space
Latest episodes

223 snips
Sep 13, 2024 • 2h 4min
From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team
Michelle Pokrass leads the API Platform at OpenAI and has an impressive background building scalable platforms at leading tech companies. She delves into the significance of structured outputs in AI, emphasizing their reliability for developers. The conversation covers the latest advancements in OpenAI's capabilities, including the O1 model, navigating database challenges, and enhancing the user experience through API innovations. They also touch on the complexities of cognitive biases in decision-making related to AI.

8 snips
Sep 3, 2024 • 1h 5min
Efficiency is Coming: 3000x Faster, Cheaper, Better AI Inference from Hardware Improvements, Quantization, and Synthetic Data Distillation
Nyla Worker, a Senior PM at Nvidia with a background in optimizing AI models at Google and eBay, shares insights on dramatic advancements in AI efficiency and inference. The discussion highlights a staggering reduction in costs and time for training models, with examples like the Cerebras platform achieving unheard-of speeds. They delve into optimizing large language models and the revolutionary potential of 3D conversational AI technology. Worker also touches on the future of digital personas and their applications in various sectors, including healthcare.

78 snips
Aug 29, 2024 • 1h 10min
Why you should write your own LLM benchmarks — with Nicholas Carlini, Google DeepMind
Nicholas Carlini, a research scientist at DeepMind specializing in AI security, discusses the power of personalized LLM benchmarks. He encourages focusing on individual use of AI tools, emphasizing that AI shines in automating mundane tasks. Carlini shares insights from his viral blog, detailing creative applications of AI in coding and problem-solving. He also navigates the dualities of LLMs, the importance of critical evaluation, and the ongoing need for robust, domain-specific benchmarks to truly gauge AI performance.

53 snips
Aug 22, 2024 • 1h 5min
Is finetuning GPT4o worth it? — with Alistair Pullen, Cosine (Genie)
Alistair Pullen, Co-founder and CEO of Cosine, discusses the groundbreaking advancements of Cosine Genie, the top coding agent, which is built on a fine-tuned GPT-4o. He shares insights on the innovative training techniques that enable the model to learn from real software engineers, enhancing coding efficiency. The conversation also delves into the challenges of fine-tuning models, the importance of synthetic data, and future innovations in AI tooling, revealing the transformative potential of advanced language models in software development.

100 snips
Aug 16, 2024 • 59min
AI Magic: Shipping 1000s of successful products with no managers and a team of 12 — Jeremy Howard of Answer.ai
Jeremy Howard, Founder of Answer.ai and a prominent figure in deep learning and fast.ai, joins the conversation to share innovative insights. He discusses revolutionary AI model training techniques that allow anyone to use minimal resources to achieve maximum output. Howard emphasizes collaboration within diverse teams, steering clear of traditional hierarchies, to foster creativity. They also explore the FastHTML framework, showcasing how it simplifies web development. The podcast dives into the ethics surrounding AI governance and the promise of dialogue engineering in transforming coding environments.

15 snips
Aug 7, 2024 • 1h 4min
Segment Anything 2: Demo-first Model Development
Joseph Nelson, a computer vision expert at Roboflow, and Nikhila Ravi, Research Engineering Manager at Facebook AI, share their insights on the groundbreaking Segment Anything Model 2 (SAM2). They discuss its remarkable efficiency in video segmentation, achieving better accuracy with significantly fewer interactions. The conversation highlights the model's revolutionary role in real-time object tracking and its open-source commitment. They also touch on the importance of user-friendly demonstrations and community involvement in evolving AI technologies.

80 snips
Aug 2, 2024 • 1h 55min
The Winds of AI Winter (Q2 Four Wars Recap) + ChatGPT Voice Mode Preview
Dive into the world of AI advancements as the hosts celebrate their milestones and discuss the Sovereign AI Summit in Singapore. Explore how models from GPU-rich labs, like Llama 3.1 and Mistral Large, are reshaping the landscape. Unpack the intriguing dynamics of synthetic data and the evolving competitive sphere beyond OpenAI. Engage in playful explorations of emotional expression through accents and voice modulation, while also uncovering the challenges of capturing tonal nuances in AI technology. Riddles and humor add a delightful twist to the conversation!

54 snips
Jul 23, 2024 • 1h 5min
Llama 2, 3 & 4: Synthetic Data, RLHF, Agents on the path to Open Source AGI
In this engaging discussion with Thomas Scialom, a leading mind behind Llama 2 and Llama 3 at Meta, listeners dive into the fascinating world of synthetic data and reinforcement learning techniques. He reveals how Llama 3 excels with 15T training tokens and how synthetic content boosts training efficiency. The importance of evaluation methods and the balance between human feedback and model training strategies take center stage. Scialom also shares insights on the future of intelligence with advanced, multi-step agents and the evolving landscape of AI innovation.

67 snips
Jul 12, 2024 • 58min
Benchmarks 201: Why Leaderboards > Arenas >> LLM-as-Judge
Clémentine Fourrier, lead maintainer of Hugging Face’s OpenLLM Leaderboard, shares her journey from geology to AI. She discusses the urgent need for standardized benchmarks in model evaluations as traditional metrics become outdated. Clémentine tackles the challenges of creating fair, community-driven assessments while addressing biases and resource limitations. She also highlights innovations like long-context reasoning benchmarks and predicts future advancements in LLM capabilities, emphasizing the importance of calibration for user trust.

23 snips
Jul 5, 2024 • 1h 45min
The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka
Yi Tay, Chief Scientist at Reka AI and former tech lead at Google Brain, shares insights on the rapidly evolving landscape of AI. He discusses the challenges faced by smaller model labs, emphasizing Reka Core's impressive debut on the LMsys leaderboard. Yi also reflects on the importance of identifying crucial research problems and maintaining a long-term vision. Other topics include the impact of social media on research visibility and the balance between academic life and startup initiatives in AI development.