AXRP - the AI X-risk Research Podcast

19 - Mechanistic Interpretability with Neel Nanda

In-Context Learning and Induction Heads

Induction heads recur in models at all scales we've studied, up to about 13 billion parameters, and they seem to be core to this capability of in-context learning: using words far back in the text to usefully predict the next token. They are the main mechanism by which this happens. It is also the case that they all seem to form during a fairly narrow band of training, in this phenomenon we call a phase change.
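The prediction rule described here can be sketched as a toy illustration (this is not code from the episode; `induction_predict` is a hypothetical helper): when the current token has appeared earlier in the context, copy whatever token followed that earlier occurrence.

```python
def induction_predict(tokens):
    """Toy sketch of the induction pattern: predict the next token by
    finding an earlier occurrence of the current token and copying the
    token that followed it. Returns None if there is no earlier match."""
    current = tokens[-1]
    # Scan backwards over earlier positions for a previous occurrence.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]  # copy the token that followed it
    return None

# Pattern "A B ... A" -> predict "B"
print(induction_predict(["the", "cat", "sat", "the"]))  # -> cat
```

A real induction head implements this soft-matching behavior with attention rather than an explicit scan, but the input-output behavior on repeated text is the same.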
