
101 - The lottery ticket hypothesis, with Jonathan Frankle
NLP Highlights
Scaling Up the Lottery Ticket Hypothesis
Jonathan: The original experiment doesn't work once you get to a certain model complexity or task size. But there are two ways of showing how this behavior does actually scale up, and I think each of those ways ends up providing some interesting insights into how deep learning works in general. We found something quite interesting, a phenomenon we're referring to as instability. So we wanted to measure how unstable the network is to the noise of stochastic gradient descent. And then we want to compare these networks somehow, to understand how much SGD noise really affected the optimization process, by seeing how different these two networks ended up. It's very exciting, because we have this ability to compare these networks.
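The transcript doesn't spell out the comparison metric, but one concrete way to instantiate this kind of instability probe is a sketch like the following: train two copies of the same initialization under different SGD noise (here, different data-shuffling seeds), then evaluate the loss along the straight line between the two final weight vectors. A large bump along the path suggests the two runs landed in different basins. This is an illustration under assumed details, not necessarily the exact procedure discussed in the episode; `make_model`, the toy data, and all hyperparameters are placeholders.

```python
import copy
import torch
import torch.nn as nn

def make_model():
    # Placeholder architecture; any network would do for the probe.
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

def train(model, X, y, shuffle_seed, epochs=20, lr=0.1):
    # shuffle_seed controls the order examples are seen -- the "SGD noise".
    g = torch.Generator().manual_seed(shuffle_seed)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for idx in torch.randperm(len(X), generator=g).split(32):
            opt.zero_grad()
            loss_fn(model(X[idx]), y[idx]).backward()
            opt.step()
    return model

def eval_loss(model, X, y):
    with torch.no_grad():
        return nn.CrossEntropyLoss()(model(X), y).item()

def interpolation_losses(model_a, model_b, X, y, steps=11):
    # Loss along the linear path between the two solutions; a spike in
    # the middle means SGD noise pushed the runs into different basins.
    sa, sb = model_a.state_dict(), model_b.state_dict()
    probe = make_model()
    losses = []
    for t in torch.linspace(0, 1, steps):
        probe.load_state_dict({k: (1 - t) * sa[k] + t * sb[k] for k in sa})
        losses.append(eval_loss(probe, X, y))
    return losses

torch.manual_seed(0)
X, y = torch.randn(512, 20), torch.randint(0, 2, (512,))  # toy data

init = make_model()  # one shared initialization for both runs
run_a = train(copy.deepcopy(init), X, y, shuffle_seed=1)
run_b = train(copy.deepcopy(init), X, y, shuffle_seed=2)

print(interpolation_losses(run_a, run_b, X, y))
```

The key design choice is that the two runs differ *only* in their data order, so any gap between the resulting networks is attributable to SGD noise rather than to initialization.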