
Neural Network Pruning and Training with Jonathan Frankle at MosaicML
Gradient Dissent: Conversations on AI
The Unexpected Bottlenecks in Modeling
When you're training at this scale, if it's not efficient, it's going to take the lifetime of the universe. A lot of us settle for that quality of solution: we're okay with a job that requires 128 cores and gigabytes or terabytes of memory. When it comes to tooling, a lot of us say, "I'm okay with TensorBoard," but we don't really do anything right. We know how to build better tools; there's some nicer stuff out there. It's always the stupid stuff. These are such complex, nuanced systems that it's the dumb mistakes that kneecap you.