
Catherine Olsson and Nelson Elhage: Anthropic, Understanding Transformers
The Gradient: Perspectives on AI
Getting Traction in the MLP Layers
The researchers have had a lot of success understanding the attention mechanism in Transformers; their next thread of research focuses on what is going on in the MLP layers. As they put it: "We're going back to basics and exploring: can we understand why they're challenging? We have another paper coming out sometime soon where we look at some other toy models that aren't Transformers at all, but much simpler toy models."