Dreamer: A Framework for Unsupervised Exploration

Director is the first algorithm that actually combines all these three aspects in at least one form. It learns a world model, it does unsupervised exploration, and it learns a goal condition policy. And empirically it works very well on sparse reward tasks. So I think there's a lot of promise there. We're starting to try it out on robots now. But the design space is much larger than that, right? There are a lot of details in how to implement these different components.

Play episode from 42:14

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app