
19 - Mechanistic Interpretability with Neel Nanda

AXRP - the AI X-risk Research Podcast


Is There a Spectrum of Cognitive Abilities?

I have a few questions about roughly where you think mechanistic interpretability is as a field. Intuitively, I kind of think of there as being some spectrum, where at the low end of the spectrum is something basic like, I don't know, "I'm detecting edges" or "I'm noticing that something is text rather than a picture." Then maybe a little bit higher level is like "I notice that it's written in English rather than French," and then at a very high level is something like reasoning. So if we want to understand them, we're going to have to understand this whole spectrum of cognitive abilities.

Play episode from 24:15
