
On Dwarkesh Patel's Podcast With Andrej Karpathy
Don't Worry About the Vase Podcast
How Much Training Data Is Memorized?
Discussion of the massive compression of training tokens into model parameters, and of when having the full context improves precision.
Transcript


