Danijar Hafner 2

TalkRL: The Reinforcement Learning Podcast

The Dreamer Out of the Box With No Changes

The main result in terms of capabilities was to solve the Minecraft diamond challenge from sparse rewards. That also needed a little bit of infrastructure setup. Director is the first step towards that: it uses world models, but also trains a goal-conditioned policy at the low level. You can then put a high-level policy on top that just directs the low-level policy around.
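
To make the two-level structure concrete, here is a minimal toy sketch of a high-level policy directing a goal-conditioned low-level policy. It is not Director's implementation: there is no world model and nothing is learned. The names `high_level_policy`, `low_level_policy`, and `rollout`, the 2-D point state, and the `goal_every=8` interval are all assumptions made for illustration.

```python
import math
import random


def high_level_policy(state, rng):
    # Hypothetical stand-in for the high-level policy: propose a goal
    # somewhere near the current state.
    return tuple(s + rng.uniform(-1.0, 1.0) for s in state)


def low_level_policy(state, goal, max_step=0.25):
    # Hypothetical goal-conditioned low-level policy: take a bounded
    # step from the current state straight toward the goal.
    delta = [g - s for g, s in zip(goal, state)]
    dist = math.hypot(*delta)
    if dist <= max_step:
        return tuple(delta)
    scale = max_step / dist
    return tuple(d * scale for d in delta)


def rollout(steps=32, goal_every=8, seed=0):
    # The high level picks a new goal every `goal_every` steps (an
    # assumed interval); the low level acts at every step.
    rng = random.Random(seed)
    state = (0.0, 0.0)
    goals = []
    for t in range(steps):
        if t % goal_every == 0:
            goals.append(high_level_policy(state, rng))
        action = low_level_policy(state, goals[-1])
        # Toy dynamics: the action is added directly to the state.
        state = tuple(s + a for s, a in zip(state, action))
    return state, goals


if __name__ == "__main__":
    final_state, goals = rollout()
    print(len(goals), final_state)
```

In this sketch the low level only ever sees the current state and the current goal, so the high level steers behaviour purely by choosing goals, which is the division of labour the excerpt describes.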
