19 - Mechanistic Interpretability with Neel Nanda

AXRP - the AI X-risk Research Podcast

Is There a Spectrum of Cognitive Abilities?

I have a few questions about roughly where you think mechanistic interpretability is as a field. Intuitively, I think of there as being some spectrum. At the low end is something basic, like detecting edges or noticing that something is text rather than a picture. A little higher up is something like noticing that the text is written in English rather than French. And at a very high level is something like reasoning. So if we want to understand them, we're going to have to understand this whole spectrum of cognitive abilities.

