
Episode 19: Minqi Jiang, UCL, on environment and curriculum design for general RL agents

Generally Intelligent


How to Maximize Regret

If you're an adversary and you want to maximize regret, you're going to naturally go for the easier things first. For some environments, it might be a little bit trickier. We had an environment that was based on 2D car racing. In that environment, we found that if you train PAIRED on the car racing agent, it's somehow able to overexploit the difference between the protagonist and the antagonist. And so you essentially end up with a very, very poor driving policy for the protagonist. It almost gets to the point where it just can't learn.
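
To make the "difference between the protagonist and the antagonist" concrete, here is a minimal sketch of a regret estimate of the kind a PAIRED-style adversary tries to maximize. It assumes regret on a level is approximated as the antagonist's best episode return minus the protagonist's mean return; the function name `estimated_regret` and the return values are hypothetical and are not taken from the episode.

```python
from statistics import mean


def estimated_regret(protagonist_returns, antagonist_returns):
    """Approximate regret on one level (assumed form, for illustration only).

    Regret is taken as the best antagonist episode return minus the
    protagonist's mean episode return on the same level.
    """
    return max(antagonist_returns) - mean(protagonist_returns)


# Hypothetical returns on two levels proposed by the adversary.
# Level A: the antagonist does well, the protagonist struggles -> high regret.
regret_a = estimated_regret([200, 300], [900, 800])

# Level B: neither agent makes progress -> zero regret, so the adversary
# gains nothing by proposing a level that no agent can solve.
regret_b = estimated_regret([0, 0], [0, 0])

print(regret_a, regret_b)
```

Under this assumed form, the adversary's objective grows whenever the gap between the two agents grows. That is consistent with the failure mode described above: a large gap can be produced by a weak protagonist as well as by a strong antagonist.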
