The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Latest episodes

68 snips
Mar 31, 2025 • 1h 9min

Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725

In this engaging discussion, Drago Anguelov, VP of AI foundations at Waymo, sheds light on the groundbreaking integration of foundation models in autonomous driving. He explains how Waymo harnesses large-scale machine learning and multimodal sensor data to enhance perception and planning. Drago also addresses safety measures, including rigorous validation frameworks and predictive models. The conversation dives into the challenges of scaling these models across diverse driving environments and the future of AV testing through sophisticated simulations.
39 snips
Mar 24, 2025 • 51min

Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724

Join Julie Kallini, a PhD student at Stanford, as she dives into the future of language models. Discover her groundbreaking work on MrT5, a model that tackles tokenization failures and enhances efficiency for multilingual tasks. Julie discusses the creation of 'impossible languages' and the insights they offer into language acquisition and model biases. Hear about innovative architecture improvements and the importance of adapting tokenization methods for underrepresented languages. A fascinating exploration at the intersection of linguistics and AI!
140 snips
Mar 17, 2025 • 59min

Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723

Jonas Geiping, a research group leader at the Ellis Institute and Max Planck Institute for Intelligent Systems, discusses innovative approaches to AI efficiency. He introduces a novel recurrent depth architecture that enables latent reasoning, allowing models to predict tokens with dynamic compute allocation based on difficulty. Geiping contrasts internal and verbalized reasoning in AI, explores challenges in scaling models, and highlights the architectural advantages that enhance performance in reasoning tasks. His insights pave the way for advancements in machine learning efficiency.
34 snips
Mar 10, 2025 • 42min

Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722

Chengzu Li, a PhD student at the University of Cambridge, unpacks his pioneering work on Multimodal Visualization-of-Thought (MVoT). He explores the intersection of spatial reasoning and cognitive science, linking concepts like dual coding theory to AI. The conversation includes insights on token discrepancy loss to enhance visual and language integration, challenges in spatial problem-solving, and real-world applications in robotics and architecture. Chengzu also shares lessons learned from experiments that could redefine how machines navigate and reason about their environment.
102 snips
Mar 3, 2025 • 49min

Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

Niklas Muennighoff, a PhD student at Stanford, dives into his groundbreaking work on the s1 reasoning model, designed to efficiently mimic OpenAI's o1 while costing under $50 to train. He elaborates on innovative techniques like 'budget forcing' that help the model tackle complex problems more effectively. The discussion highlights the intricacies of test-time scaling, the importance of data curation, and the differences between supervised fine-tuning and reinforcement learning. Niklas also shares insights on the future of open-source AI models.
45 snips
Feb 24, 2025 • 1h 7min

Accelerating AI Training and Inference with AWS Trainium2 with Ron Diamant - #720

Ron Diamant, Chief Architect for Trainium at AWS, delves into the Trainium2 chip designed for AI and ML acceleration. He discusses its systolic array architecture and how it compares with traditional GPUs across key performance dimensions. The conversation highlights the ecosystem surrounding Trainium, including the Neuron SDK and its various provisioning options. Diamant also touches on customer adoption, performance benchmarks, and future prospects for Trainium, showcasing its role in AI training and inference.
80 snips
Feb 18, 2025 • 53min

π0: A Foundation Model for Robotics with Sergey Levine - #719

In this discussion, Sergey Levine, an associate professor at UC Berkeley and co-founder of Physical Intelligence, dives into π0, a general-purpose robotic foundation model. He explains its architecture, which combines a vision-language model with a novel action expert. The conversation touches on the critical balance of training data, the significance of open-sourcing, and impressive robot capabilities like folding laundry. Levine also highlights the future of affordable robotics and the potential for diverse applications.
263 snips
Feb 10, 2025 • 1h 45min

AI Trends 2025: AI Agents and Multi-Agent Systems with Victor Dibia - #718

Victor Dibia, a Principal Research Software Engineer at Microsoft Research, joins to discuss the future of AI agents and multi-agent systems. He highlights how these systems surpass traditional software with their reasoning and adaptability. The conversation dives into the rise of agentic foundation models, evaluating their performance, and the growing enterprise applications. Victor also shares insights on implementing successful AI architectures, the impact on software engineering, and the importance of human-AI collaboration in navigating these advancements.
24 snips
Feb 4, 2025 • 1h 17min

Speculative Decoding and Efficient LLM Inference with Chris Lott - #717

In this discussion, Chris Lott, Senior Director of Engineering at Qualcomm AI Research, dives into the complexities of accelerating large language model inference. He details the challenges of encoding and decoding, alongside hardware constraints like memory bandwidth and performance metrics. Lott shares innovative techniques for boosting efficiency, such as KV compression and speculative decoding. He also envisions the future of AI on edge devices, emphasizing the importance of small language models and integrated orchestrators for seamless user experiences.
59 snips
Jan 28, 2025 • 52min

Ensuring Privacy for Any LLM with Patricia Thaine - #716

Patricia Thaine, co-founder and CEO of Private AI, specializes in privacy-preserving AI techniques. She dives into the critical issues of data minimization, the risks of personal data leakage from large language models (LLMs), and the challenges of redacting sensitive information across different formats. Patricia highlights the limitations of data anonymization, the balance between real and synthetic data for model training, and the evolving landscape of AI regulations like GDPR. She also discusses the ethical considerations surrounding bias in AI and the future of privacy in technology.