Episode 19: Minqi Jiang, UCL, on environment and curriculum design for general RL agents

Generally Intelligent

Game Theory

Don't train on levels where you're just visiting those levels for the first time, and only train when you replay the levels. From a game point of view, it makes the algorithm equivalent to playing against a regret-maximizing adversary. I would definitely be curious to see, like, if you had the oracle values for all the scores at any one time, and then different staleness settings, and see, like, what does that drift look like?
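
The "only train when you replay" rule can be sketched roughly as follows. This is a minimal illustration, not the method as implemented by the speaker: the function name `robust_replay_loop`, its parameters, the callbacks (`sample_new_level`, `rollout`, `update_policy`, `score_fn`), and the greedy "replay the highest-scoring level" choice are all assumptions made for the example. A real curriculum method would likely sample replay levels from a distribution that also accounts for how stale each score is.

```python
import random


def robust_replay_loop(sample_new_level, rollout, update_policy, score_fn,
                       steps=100, replay_prob=0.5, buffer_size=8, seed=0):
    """Hypothetical sketch: score new levels without training, train only on replays."""
    rng = random.Random(seed)
    buffer = {}      # level -> most recent score (e.g. a regret estimate)
    n_updates = 0

    for _ in range(steps):
        if buffer and rng.random() < replay_prob:
            # Replay branch: pick a previously seen level and train on it.
            # Greedy choice is a simplification assumed for this sketch.
            level = max(buffer, key=buffer.get)
            trajectory = rollout(level)
            update_policy(trajectory)
            n_updates += 1
        else:
            # First-visit branch: collect a trajectory for scoring only,
            # with no policy update.
            level = sample_new_level(rng)
            trajectory = rollout(level)

        # Refresh the score in both branches, since scores go stale
        # as the policy changes.
        buffer[level] = score_fn(trajectory)

        # Keep the buffer bounded by evicting the lowest-scoring level.
        if len(buffer) > buffer_size:
            del buffer[min(buffer, key=buffer.get)]

    return buffer, n_updates
```

In this sketch, every call to `update_policy` is made on a level that was already scored on an earlier step, which is the property the snippet describes: the levels the agent trains on are always ones selected for their score rather than arbitrary fresh samples.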
