The Achilles Heel of Reinforcement Learning

Go-Explore is an exploration algorithm. Its backdrop is the original DQN paper, which launched 10,000 papers, in the sense of kicking off the deep reinforcement learning revolution and putting DeepMind on the map. One game on which DQN literally scored zero was Montezuma's Revenge, because it is very, very difficult to ever get any reward in that game through random actions.

