The Core Question of OBL

OBL (off-belief learning) is aimed at people who don't easily get bored hearing about deep RL. It tries to prevent agents from developing their own communication protocols. The core problem was: how can we train a policy that learns to play Hanabi from scratch but is not able to develop any communication protocols at all? And so the core question of OBL is... yeah, please go ahead.
