
19 - Mechanistic Interpretability with Neel Nanda
AXRP - the AI X-risk Research Podcast
Is There a Spectrum of Cognitive Abilities?
I have a few questions about roughly where you think mechanistic interpretability is as a field. Intuitively, I think of there as being some spectrum. At the low end is something basic, like detecting edges or noticing that something is text rather than a picture. A little higher up is something like noticing that text is written in English rather than French. And at a very high level is something like reasoning. So if we want to understand them, we're going to have to understand this whole spectrum of cognitive abilities.