The Overton Window of RL Research

In supervised learning, obviously you don't want to just do super well on your training set. That's just overfitting. And so there was a series of papers that showed that things like DQN all overfit to the training environments. If you modify Atari slightly, it breaks the agent. But one thing they notably didn't think about was that, in all their baselines, they just performed uniform sampling of all the levels for each environment. It seems like the order really matters.
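To make the contrast concrete, here is a minimal Python sketch of the uniform level sampling the baselines used, next to one possible non-uniform alternative. The snippet does not say which ordering scheme is meant, so `sample_prioritized`, the per-level `scores`, and the `temperature` parameter are all hypothetical names chosen for illustration, not the method from any particular paper.

```python
import random


def sample_uniform(levels, rng):
    """Baseline described above: every training level is equally likely."""
    return rng.choice(levels)


def sample_prioritized(levels, scores, rng, temperature=1.0):
    """Hypothetical alternative: draw levels in proportion to a per-level score.

    `scores` is an assumed stand-in for any signal of how useful a level
    currently is for learning; the transcript does not specify one.
    """
    weights = [
        max(scores[level], 0.0) ** (1.0 / temperature) + 1e-8  # keep weights positive
        for level in levels
    ]
    return rng.choices(levels, weights=weights, k=1)[0]


rng = random.Random(0)
levels = list(range(5))
# Illustrative scores only: level 4 is marked as the most useful right now.
scores = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0}

uniform_draws = [sample_uniform(levels, rng) for _ in range(1000)]
prioritized_draws = [sample_prioritized(levels, scores, rng) for _ in range(1000)]
```

Under uniform sampling each level shows up about equally often regardless of how the agent is doing, while the score-weighted sampler concentrates training on whichever levels the scores favor, which is one way the order of levels can be made to matter.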
