Offline Deep Learning

Scholand: I think we've made a lot of progress on offline RL. But there are major challenges still to address, and these challenges fall into two broad categories. The first category has to do with something that's not really unique to offline RL; it's a problem for all RL methods. RL methods, not just offline RL but all of them, are harder to use than supervised learning methods. A big part of why they're harder to use is that value-based methods like Q-learning, for example, are not actually equivalent to gradient descent.
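
As a rough illustration of that last point (not something from the episode), the sketch below compares the update Q-learning applies with the true gradient of the squared TD error. It assumes a linear Q-function, Q(s, a) = w · phi(s, a), and a single made-up transition; the function names, feature vectors, and numbers are all hypothetical.

```python
# Illustrative sketch only: linear Q-function on one made-up transition.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def td_error(w, phi_sa, phi_next_best, reward, gamma):
    # phi_next_best: features of the greedy next action, treated as fixed here.
    return reward + gamma * dot(w, phi_next_best) - dot(w, phi_sa)

def q_learning_direction(w, phi_sa, phi_next_best, reward, gamma):
    """Semi-gradient step used by Q-learning: the target is held constant."""
    delta = td_error(w, phi_sa, phi_next_best, reward, gamma)
    return [delta * x for x in phi_sa]

def true_descent_direction(w, phi_sa, phi_next_best, reward, gamma):
    """Negative gradient of 0.5 * delta**2, differentiating through the target too."""
    delta = td_error(w, phi_sa, phi_next_best, reward, gamma)
    return [delta * (x - gamma * y) for x, y in zip(phi_sa, phi_next_best)]

# Hypothetical values chosen only to make the difference visible.
w = [1.0, 0.5]
phi_sa = [1.0, 0.0]
phi_next_best = [0.0, 1.0]
reward, gamma = 1.0, 0.9

semi = q_learning_direction(w, phi_sa, phi_next_best, reward, gamma)
full = true_descent_direction(w, phi_sa, phi_next_best, reward, gamma)
print("Q-learning step direction:", semi)
print("True descent direction:   ", full)
```

The two directions agree on the first weight but not on the second: Q-learning ignores how the bootstrapped target itself changes with the weights, so its update is not the gradient of this objective. That is one concrete sense in which the method differs from the gradient descent used in supervised learning.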

Play episode from 45:10

