How to Maximize Reputation

If you're an adversary and you want to maximize regret, you're going to naturally go for the easier things first. For some environments, it might be a little bit trickier. We had an environment that was based on 2D car racing. In that environment, we found that if you train PAIRED on the car racing agent, it's somehow able to over-exploit the difference between the protagonist and the antagonist. And so you essentially end up with a very, very poor driving policy for the protagonist. It almost gets to the point where it just can't learn.
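To make the "difference between the protagonist and the antagonist" concrete, here is a minimal sketch of a PAIRED-style regret estimate: the antagonist's best return on a level minus the protagonist's average return on the same level. The function name and the numbers are hypothetical and chosen only for illustration; this is not the speaker's implementation.

```python
from statistics import mean


def estimate_regret(protagonist_returns, antagonist_returns):
    """Regret estimate for one level: best antagonist return minus
    the protagonist's mean return (assumed PAIRED-style estimate)."""
    return max(antagonist_returns) - mean(protagonist_returns)


# Hypothetical episode returns on a single generated level.
balanced = estimate_regret([1.0, 3.0], [4.0, 6.0])
# If the protagonist barely learns to drive, the gap (and so the regret) grows.
exploited = estimate_regret([0.0, 0.0], [4.0, 6.0])
```

In this sketch the adversary's objective rises whenever the protagonist does worse relative to the antagonist, which is one way to read the over-exploitation described above.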
