Jeff Clune

TalkRL: The Reinforcement Learning Podcast

CHAPTER

The Inverse Dynamic Model of Learning to Play Minecraft

We train this, we call this the inverse dynamics model, the IDM. It gets to see the past and the future and tell us the action that must have happened in the middle. Then we go to a YouTube video for which we don't have actions. We run the labeler, and now it gives us what it thinks is the label at every time step in that video. Okay, now we have eight years of labeled YouTube videos. We have the video and the corresponding action, or, you know, the action we think happened. Now we throw away the data labeler, which is the non-causal bit. We train now just a normal neural net who
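
The pipeline described above (fit a labeler that sees both past and future, run it over unlabeled video, then train an ordinary causal policy on the resulting labels) can be sketched in a few lines. This is only a toy illustration under loud assumptions: the "frames" are integer positions in a made-up 1-D world, the actions are the strings "left", "stay", and "right", and both models are simple vote tables rather than neural nets. The names `train_idm`, `pseudo_label`, and `train_policy` are hypothetical and do not come from the episode.

```python
from collections import Counter, defaultdict


def train_idm(labeled_trajectories):
    """Fit a toy inverse dynamics model (IDM).

    Assumption: a frame is an integer position, so the IDM only needs the
    difference between the frame before and the frame after an action.
    It is non-causal: it looks at the future frame to infer the action.
    """
    votes = defaultdict(Counter)
    for frames, actions in labeled_trajectories:
        for t, action in enumerate(actions):
            delta = frames[t + 1] - frames[t]
            votes[delta][action] += 1
    table = {delta: c.most_common(1)[0][0] for delta, c in votes.items()}

    def idm(past_frame, future_frame):
        # Returns None for a transition never seen in the labeled data.
        return table.get(future_frame - past_frame)

    return idm


def pseudo_label(idm, frames):
    """Run the IDM over an unlabeled video: one inferred action per step."""
    return [idm(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]


def train_policy(pseudo_labeled_videos):
    """Behavior cloning on the inferred labels.

    The policy is causal: it maps the current frame alone to an action,
    here by majority vote over the pseudo-labels.
    """
    votes = defaultdict(Counter)
    for frames, actions in pseudo_labeled_videos:
        # zip stops at the last action, so the final frame is skipped.
        for frame, action in zip(frames, actions):
            if action is not None:
                votes[frame][action] += 1
    table = {frame: c.most_common(1)[0][0] for frame, c in votes.items()}

    def policy(frame):
        return table.get(frame)

    return policy


# A small amount of data with known actions (hypothetical values).
labeled = [([0, 1, 2, 2, 1], ["right", "right", "stay", "left"])]
idm = train_idm(labeled)

# A larger pile of "videos" with no actions attached.
videos = [[0, 1, 2, 3, 4], [2, 3, 4], [1, 2, 3, 4, 4]]
pseudo_labeled = [(frames, pseudo_label(idm, frames)) for frames in videos]

# The IDM is no longer needed once the labels exist.
policy = train_policy(pseudo_labeled)
print(policy(0), policy(4))
```

The point of the split is that the labeler can use information the final policy will never have at decision time (the next frame), which is why it is discarded once the videos are labeled.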
