
Catherine Olsson and Nelson Elhage: Anthropic, Understanding Transformers
The Gradient: Perspectives on AI
The Gradient Podcast - Part 2
We're pretty optimistic that it has at least some bearing on large language models. Even if you do want to train these models, models at this scale can easily be trained on a single GPU and relatively small datasets. So for those of us without small supercomputers to run GPT-3, does that seem like it'll help? That was really interesting for me. We need these papers.