
Machine Learning Street Talk (MLST)
How Do AI Models Actually Think? - Laura Ruis
Jan 20, 2025
Laura Ruis, a PhD student at University College London and researcher at Cohere, discusses her groundbreaking work on reasoning capabilities of large language models. She delves into whether these models rely on fact retrieval or procedural knowledge. The conversation highlights the influence of pre-training data on AI behavior and examines the complexities in defining intelligence. Ruis also explores the philosophical implications of AI agency and creativity, raising questions about how AI models mimic human reasoning and the potential risks they pose.
01:18:01
Quick takeaways
- Laura Ruis discusses how large language models (LLMs) leverage procedural knowledge from their training data to enhance reasoning capabilities, rather than merely relying on memorization.
- The conversation emphasizes that the specific types of documents included in the pre-training data significantly influence LLM performance on reasoning tasks.
Deep dives
Exploring Reasoning in Language Models
Large language models (LLMs) exhibit capabilities that resemble both reasoning and fact retrieval, prompting questions about how much of their performance is driven by model scale versus training data. One key issue is whether LLMs simply memorize information from their training data or learn qualitatively new problem-solving strategies. Examining zero-shot reasoning traces reveals that, while models can reproduce arithmetic steps seen during training, they can also synthesize knowledge to reach conclusions that were not explicitly memorized. This suggests that LLMs may engage in approximate reasoning that, while not formal, generalizes across problem-solving tasks.
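To make the memorization-versus-procedure distinction concrete, here is a minimal sketch (not from the episode) contrasting a factual query, which a model could plausibly answer by retrieving a memorized fact, with an arithmetic query whose exact answer is unlikely to appear verbatim in training data and so requires applying a learned procedure. The `query_model` function is a hypothetical placeholder for any LLM completion call.

```python
def query_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's completion API."""
    raise NotImplementedError("Plug in a real model client here.")


# Retrieval-style query: the answer is plausibly memorized as a single fact.
factual_prompt = "What is the capital of France?"

# Procedural query: the exact answer is unlikely to appear verbatim in the
# training data, so the model must apply a procedure (multi-digit arithmetic).
reasoning_prompt = "Compute 4821 * 379, showing your steps."

for prompt in (factual_prompt, reasoning_prompt):
    print(prompt)
    # print(query_model(prompt))  # uncomment once query_model is implemented
```

Comparing which pre-training documents most influence answers to each kind of prompt is, roughly, how the episode frames the evidence that reasoning draws on procedural knowledge rather than pure fact retrieval.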