
Episode 19: Minqi Jiang, UCL, on environment and curriculum design for general RL agents
Generally Intelligent
The UCB Bandit and the Generalization Gap
The performance is really amazing. I was curious, did the UCB bandit impact the generalization gap at all?

Yes. So we found that UCB-DrAC, when combined with PLR, actually had the lowest generalization gap. But what's interesting is that they're almost perfectly orthogonal, in the sense that these experiments showed PLR achieving generalization gains just by changing the order in which you visit the data. They are quite different, orthogonal ways of improving generalization. And so I think that's a really interesting experiment to try.