
138 - Compositional Generalization in Neural Networks, with Najoung Kim

NLP Highlights


What Are the High Level Trends in the Results?

In-distribution accuracy was almost 99% in every model that we tested, reliably across all random initializations of the models. But the overall generalization performance, on the out-of-distribution test cases, was much lower, and it also had much higher sensitivity to the random initializations of the models. Some follow-up work did find that just training the model for longer can actually lead to better out-of-distribution generalization performance.
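The evaluation protocol described here, training with several random seeds and then comparing in-distribution accuracy against out-of-distribution (generalization) accuracy and its seed-to-seed variance, can be sketched in a few lines of Python. The accuracy numbers below are purely illustrative placeholders, not results from the paper:

```python
# Minimal sketch of the protocol discussed above: summarize per-seed accuracies
# on two evaluation splits. All numbers here are hypothetical, for illustration.
from statistics import mean, stdev

def summarize(split_name, per_seed_accuracies):
    """Report mean accuracy and seed-to-seed spread for one evaluation split."""
    m = mean(per_seed_accuracies)
    s = stdev(per_seed_accuracies) if len(per_seed_accuracies) > 1 else 0.0
    print(f"{split_name}: mean={m:.3f}, std={s:.3f} over {len(per_seed_accuracies)} seeds")

# Placeholder per-seed accuracies (one value per random initialization):
in_dist = [0.99, 0.99, 0.98, 0.99, 0.99]      # near-ceiling, low variance
out_of_dist = [0.35, 0.62, 0.18, 0.50, 0.41]  # lower mean, high variance

summarize("in-distribution", in_dist)
summarize("out-of-distribution", out_of_dist)
```

Reporting the standard deviation across seeds, not just the mean, is what surfaces the sensitivity to random initialization mentioned in the episode.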
