Scaling and Data Printing: The Future of Machine Learning

There have been all these algorithmic and architectural advances in machine learning over the last 10 years, but the dominant trend has been simply scaling things up: training larger and larger models on larger and larger data sets. What I mean by that is, if you look at the model's error on some task, that error tends to fall off like the number of training examples you give it raised to some exponent. That's a power law. This is really good news in the sense that performance just improves consistently, so we can expect to keep improving our models. But it's bad news in the sense that power laws are really slow. If you want to shave that 2% down to 1% test error you
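To make the "power laws are really slow" point concrete, here is a minimal sketch that assumes test error follows error = a * n ** (-exponent), where n is the number of training examples. The function name `data_multiplier` and the exponent values are illustrative assumptions, not figures from the episode.

```python
def data_multiplier(error_ratio: float, exponent: float) -> float:
    """Factor by which the training set must grow to scale the error by
    `error_ratio`, assuming error = a * n ** (-exponent).

    Derivation: (k * n) ** (-exponent) / n ** (-exponent) = error_ratio,
    so k = error_ratio ** (-1 / exponent).
    """
    if not 0 < error_ratio <= 1:
        raise ValueError("error_ratio must be in (0, 1]")
    if exponent <= 0:
        raise ValueError("exponent must be positive")
    return error_ratio ** (-1.0 / exponent)


# Halving the error (for example, 2% down to 1%) under a few assumed exponents.
for exponent in (1.0, 0.5, 0.1):
    factor = data_multiplier(0.5, exponent)
    print(f"exponent={exponent}: need about {factor:.0f}x more data")
```

Under this assumption, the smaller the exponent, the faster the data requirement grows: halving the error needs 2 ** (1 / exponent) times as many training examples.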
