“A Pragmatic Vision for Interpretability” by Neel Nanda

LessWrong (Curated & Popular)

Case Study: Lessons from Sparse Autoencoders

Neel recounts mistakes from sparse autoencoder (SAE) research and how a focus on proxy tasks revealed SAEs' limited payoff there, in contrast to their use for discovery.

