
Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability
Future of Life Institute Podcast
A Transformer Is a Sequence Modeling Thing
The model is made up of these layers, these simple functions. A transformer should work on a sequence with one word just as well as on a sequence with a thousand words. And transformers are made up of alternating attention and MLP layers. So at its heart, because the model is a sequence modeling thing, it's doing things in parallel on each word.
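To make that structure concrete, here is a minimal NumPy sketch of one transformer block, alternating an attention layer with an MLP layer. The weight names and shapes are hypothetical illustrations, not code from the episode; residual connections are included since standard transformers use them. Attention is the only step that moves information between positions, while the MLP runs independently, in parallel, at every position, which is why the same block handles a one-word or thousand-word sequence unchanged.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, Wq, Wk, Wv):
    # The only step where positions exchange information:
    # every position attends over the whole sequence.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

def mlp(x, W1, W2):
    # Applied independently (in parallel) at each position.
    return np.maximum(0, x @ W1) @ W2

def transformer_block(x, params):
    x = x + attention(x, *params["attn"])  # residual + attention
    x = x + mlp(x, *params["mlp"])         # residual + MLP
    return x

# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
d_model, d_mlp = 16, 64
params = {
    "attn": [rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3)],
    "mlp": [rng.normal(size=(d_model, d_mlp)) * 0.1,
            rng.normal(size=(d_mlp, d_model)) * 0.1],
}

# The same block works on a 1-token and a 1000-token sequence.
for seq_len in (1, 1000):
    x = rng.normal(size=(seq_len, d_model))
    print(seq_len, transformer_block(x, params).shape)
```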