
Episode 19: Minqi Jiang, UCL, on environment and curriculum design for general RL agents
Generally Intelligent
The Overton Window of RL Research
In supervised learning, obviously you don't want to just do super well on your training set. That's just overfitting. And so there was a series of papers that showed that things like DQN all overfit to the training environments. If you modify Atari slightly, it breaks the agent. But one thing they notably didn't think about was that, in all their baselines, they just performed uniform sampling of all the levels for each environment. It seems like the order really matters.