
Jakob Foerster
TalkRL: The Reinforcement Learning Podcast
The Core Question of OBL
OBL is aimed at people who don't easily get bored hearing about deep RL. It tries to prevent agents from developing their own communication protocols. The core problem was: how can we train a policy that learns to play Hanabi from scratch but is not able to develop any communication protocols at all? And so the core question of OBL is... yeah, please go ahead.