Danijar Hafner 2

TalkRL: The Reinforcement Learning Podcast

CHAPTER

The Dreamer Out of the Box With No Changes

The main capability result was solving the Minecraft diamond challenge from sparse rewards, which also required some infrastructure setup. Director is the first step toward that: it uses world models and also trains a goal-conditioned policy at the low level. A high-level policy then sits on top and directs the low-level policy around.
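
The two-level arrangement described above can be pictured with a toy control loop. This is only an illustrative sketch, not Director's actual implementation: it uses no world model and no learning, and the 1-D environment, the hand-written policies, and the names `low_level_policy`, `high_level_policy`, `run_episode`, `goal_every`, and `max_jump` are all hypothetical. It shows only the control flow, where a high-level policy picks a goal every few steps and a goal-conditioned low-level policy acts to reach it.

```python
from typing import List


def low_level_policy(position: int, goal: int) -> int:
    """Goal-conditioned policy (hand-written stand-in for a learned one).

    Returns a single step of -1, 0, or +1 that moves toward the goal.
    """
    if goal > position:
        return 1
    if goal < position:
        return -1
    return 0


def high_level_policy(position: int, target: int, max_jump: int) -> int:
    """High-level policy (hypothetical): picks a nearby subgoal.

    The subgoal lies at most `max_jump` cells from the current position,
    in the direction of the final target.
    """
    delta = max(-max_jump, min(max_jump, target - position))
    return position + delta


def run_episode(
    start: int,
    target: int,
    goal_every: int = 4,
    max_jump: int = 3,
    max_steps: int = 50,
) -> List[int]:
    """Runs the two-level loop on a 1-D line and returns visited positions."""
    position = start
    goal = start
    trajectory = [position]
    for step in range(max_steps):
        # The high-level policy only acts every `goal_every` steps.
        if step % goal_every == 0:
            goal = high_level_policy(position, target, max_jump)
        # The low-level policy acts every step, conditioned on the goal.
        position += low_level_policy(position, goal)
        trajectory.append(position)
        if position == target:
            break
    return trajectory


if __name__ == "__main__":
    print(run_episode(start=0, target=7))
```

In this toy version the low-level policy never sees the final target, only the current subgoal, which is the sense in which the high-level policy "directs it around".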
