Generative AI in the Real World

Emmanuel Ameisen on LLM Interpretability

Oct 2, 2025
Emmanuel Ameisen, an interpretability researcher who previously worked at Anthropic, shares fascinating insights into large language models. He dives into how these models resemble biological systems, revealing surprising patterns like multi-token planning and shared neurons across languages. Emmanuel discusses the mechanisms behind hallucinations and the importance of model calibration. He also explores practical applications in medicine and offers invaluable advice for developers on understanding and evaluating model behavior.
INSIGHT

Models Behave Like Grown Biological Systems

  • Language models are more like grown biological systems than hand-written programs.
  • Interpretability research pokes and probes model internals, much as neuroscience probes brains, to locate functional parts (see the probing sketch below).
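
A minimal sketch of the "poking and probing" idea: train a linear probe on a model's hidden states and test whether a simple property is linearly readable at a given layer. The model choice (GPT-2), the toy animal-vs-tool labels, and the layer index are illustrative assumptions, not the methods discussed in the episode, which go much further into circuit-level analysis.

```python
# Minimal linear-probe sketch (assumptions: GPT-2, toy labels, layer 6).
# Requires: pip install torch transformers scikit-learn
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

# Tiny toy dataset: does the sentence mention an animal (1) or a tool (0)?
texts = [
    ("The cat slept on the warm windowsill.", 1),
    ("A hammer lay forgotten in the drawer.", 0),
    ("The fox darted across the empty road.", 1),
    ("She tightened the bolt with a wrench.", 0),
    ("An owl hooted somewhere in the dark.", 1),
    ("The screwdriver rolled off the bench.", 0),
]

LAYER = 6  # which hidden-state layer to probe (illustrative choice)

def hidden_vector(text: str) -> torch.Tensor:
    """Mean-pool the hidden states of one layer for a single sentence."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[LAYER][0].mean(dim=0)

X = torch.stack([hidden_vector(t) for t, _ in texts]).numpy()
y = [label for _, label in texts]

# If a simple linear probe separates the classes, the property is
# linearly readable at this layer (on this tiny toy set).
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe training accuracy:", probe.score(X, y))
```

This is only the simplest flavor of "finding functional parts"; it tells you a feature is present at a layer, not which components compute it.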
INSIGHT

Models Plan Ahead And Share Concepts

  • Models often plan multiple tokens ahead instead of strictly predicting one token at a time.
  • Models form shared, language-agnostic concept representations, for example a single 'tall' concept reused across languages (see the cross-lingual similarity sketch below).
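
One rough, black-box way to see shared cross-lingual representations: embed a translation pair with a multilingual encoder and check that the pair is more similar to each other than to an unrelated sentence. The model (`xlm-roberta-base`), the sentences, and mean pooling are all assumptions for illustration; the episode describes feature-level analysis inside the model, which this only approximates, and a raw pretrained encoder may show only a modest gap.

```python
# Cross-lingual similarity sketch (assumptions: xlm-roberta-base, mean pooling).
# Requires: pip install torch transformers
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final hidden states into one sentence vector."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[0].mean(dim=0)

english = embed("The building is very tall.")
french = embed("Le bâtiment est très haut.")          # same meaning, different language
unrelated = embed("I had soup for lunch yesterday.")  # different meaning, same language

print("EN vs FR (same concept):", F.cosine_similarity(english, french, dim=0).item())
print("EN vs unrelated:        ", F.cosine_similarity(english, unrelated, dim=0).item())
# If the translation pair scores noticeably higher, the representation is behaving
# in a language-agnostic way, at least under this crude sentence-level test.
```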
INSIGHT

Chains Of Thought Aren't Always Truthful

  • Reasoning-style outputs can be deceptive: the model's written chain-of-thought may not reflect real internal computation.
  • Looking at the internals can reveal the model guessing the answer rather than performing the steps it describes (see the consistency-check sketch below).
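
A simple black-box check in the spirit of chain-of-thought faithfulness experiments: corrupt an intermediate reasoning step and see whether the final answer changes. If it never does, the written reasoning was probably not what produced the answer. The `generate` callable and the arithmetic example here are hypothetical placeholders; plug in whatever model API you actually use. Internals-based methods, as discussed in the episode, go further than this.

```python
# Chain-of-thought consistency check (sketch).
# `generate` is a hypothetical placeholder: any callable mapping a prompt
# string to the model's completion string (an API client, a local model, ...).
from typing import Callable

def cot_consistency_check(
    generate: Callable[[str], str],
    question: str,
    corruption: Callable[[str], str],
) -> dict:
    """Compare the final answer with and without a corrupted reasoning step."""
    # 1. Get the model's own chain of thought.
    cot = generate(f"{question}\nThink step by step, then give the answer.")

    # 2. Corrupt the reasoning (e.g., swap a number in an intermediate step),
    #    feed both versions back, and ask only for the final answer.
    corrupted_cot = corruption(cot)
    answer_original = generate(f"{question}\n{cot}\nFinal answer:")
    answer_corrupted = generate(f"{question}\n{corrupted_cot}\nFinal answer:")

    return {
        "original_answer": answer_original.strip(),
        "corrupted_answer": answer_corrupted.strip(),
        # If corrupting the reasoning never changes the answer, the stated
        # steps are likely post-hoc rather than the real computation.
        "answer_depends_on_reasoning": answer_original.strip() != answer_corrupted.strip(),
    }

if __name__ == "__main__":
    # Dummy stand-in so the sketch runs without any model: always answers "42",
    # i.e. it behaves like a model whose answer ignores its own reasoning.
    dummy = lambda prompt: "Step 1: ... Step 2: ... The answer is 42."
    report = cot_consistency_check(
        dummy,
        question="What is 6 times 7?",
        corruption=lambda cot: cot.replace("Step 2", "Step 2 (wrong: use 8 instead of 7)"),
    )
    print(report)
```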