
Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability

Future of Life Institute Podcast

CHAPTER

Reverse Engineering Induction Heads

Induction heads implement a general algorithm that works on arbitrary words but performs one very specific task. They appear in all the open-source models I could find, and they're really important for what we call in-context learning. The point in training at which models gain this ability coincides exactly with the dramatic phase in which induction heads are learnt.
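The summary does not spell out the "very specific task", so the following is only a toy sketch under an assumption: that the task is the commonly described induction pattern, where a model that has seen "A B" earlier in the context predicts "B" the next time it sees "A". The function name `induction_predict` is hypothetical, and this plain-Python lookup only imitates the behaviour; it is not how a transformer head computes it.

```python
from typing import Optional, Sequence


def induction_predict(tokens: Sequence[str]) -> Optional[str]:
    """Toy induction rule (assumed behaviour, not a real attention head).

    Look for the most recent earlier occurrence of the current (last)
    token and return the token that followed it, or None if the current
    token has not appeared before.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards over earlier positions, skipping the last token itself.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None


if __name__ == "__main__":
    # Works on arbitrary words: it only depends on repetition in the context.
    print(induction_predict(["Neel", "Nanda", "said", "Neel"]))
```

The rule never looks at what the words mean, only at where they repeated, which is one way to read the claim that the algorithm is general across arbitrary words while still doing one narrow job.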
