How to Improve Pre-Training Loss

This is one of these places where I think theory can be useful. There's some instinct that things learn easier functions sooner. Exactly, and it's pretty simple, right? And then you build more complicated mechanisms on top of that. Can you actually say something meaningful theoretically about this? Yeah, well, the lottery ticket hypothesis is not quite theoretical work per se, but I think it tries to get at this a little bit. It's definitely something you should measure and actually look at empirically. And so I think that's a good reason to expect it not to arise very early in training.
