Mechanistic Interpretability and How LLMs Understand

RSam Podcast

00:00

Introduction to mechanistic interpretability

Pierre defines mechanistic interpretability, the linear representation and superposition hypotheses, and latent feature directions.

