Offline Deep Learning
Scholand: I think we've made a lot of progress on offline RL. But there are major challenges still to address, and they fall into two broad categories. The first category is not really unique to offline RL; it's a problem for all RL methods. RL methods, not just offline ones, are harder to use than supervised learning methods. A big part of why they're harder to use is that value-based methods like Q-learning, for example, are not actually equivalent to gradient descent.
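To make the last point concrete, here is a minimal sketch, not taken from the conversation, comparing the update direction Q-learning uses with the true gradient of the squared TD error. Everything in it is an illustrative assumption: a linear Q-function `Q(s, a) = w[a] * s` over a scalar state, two actions, a discount of 0.9, and the hypothetical helper names `semi_gradient` and `full_gradient`.

```python
# Illustrative sketch (assumed toy setup): Q-learning update vs. true gradient.
# Linear Q-function: Q(s, a) = w[a] * s, with a scalar state s and two actions.

GAMMA = 0.9  # assumed discount factor


def q_values(w, s):
    """Return Q(s, a) for every action a."""
    return [w_a * s for w_a in w]


def td_error(w, s, a, r, s_next):
    """delta = (r + gamma * max_a' Q(s', a')) - Q(s, a)."""
    target = r + GAMMA * max(q_values(w, s_next))
    return target - q_values(w, s)[a]


def semi_gradient(w, s, a, r, s_next):
    """Direction used by Q-learning: the target is treated as a constant."""
    delta = td_error(w, s, a, r, s_next)
    grad = [0.0] * len(w)
    grad[a] = -delta * s  # derivative of 0.5 * delta^2 through Q(s, a) only
    return grad


def full_gradient(w, s, a, r, s_next):
    """True gradient of 0.5 * delta^2, differentiating through the target too."""
    delta = td_error(w, s, a, r, s_next)
    q_next = q_values(w, s_next)
    a_star = q_next.index(max(q_next))  # greedy action at the next state
    grad = [0.0] * len(w)
    grad[a] += -delta * s                    # term from -Q(s, a)
    grad[a_star] += delta * GAMMA * s_next   # term from the bootstrapped target
    return grad


if __name__ == "__main__":
    w = [1.0, 2.0]
    transition = dict(s=1.0, a=0, r=1.0, s_next=2.0)
    print("Q-learning direction:", semi_gradient(w, **transition))
    print("True gradient:       ", full_gradient(w, **transition))
```

In this toy setup the two directions disagree whenever the bootstrapped target depends on the weights, which is one way to read the claim that Q-learning is not gradient descent on any fixed objective: the update drops the term that comes from differentiating the target.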