Jeff Clune

TalkRL: The Reinforcement Learning Podcast

The Inverse Dynamics Model for Learning to Play Minecraft

We train this, we call this the inverse dynamics model, the IDM. It gets to see the past and the future and tell us the action that must have happened in the middle. Then we go to a YouTube video for which we don't have actions. We run the labeler, and now it gives us what it thinks is the label at every time step in that video. Okay, now we have eight years of labeled YouTube videos. We have the video and the corresponding action, or, you know, the action we think happened. Now we throw away the data labeler, which is the non-causal bit. We train now just a normal neural net who
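
The pipeline described in this excerpt can be sketched on a toy problem. This is only an illustration under loud assumptions, not the model discussed in the episode: the "video" is a list of integer positions on a line, the actions are -1, 0 and +1, and lookup tables stand in for the neural nets. All names (`rollout`, `train_idm`, `label_video`, `clone_policy`, `run_policy`) are hypothetical. The sketch shows the three steps: fit an inverse dynamics model on a small labeled set, use it to label action-free videos, then train a causal policy on those pseudo-labels.

```python
from collections import Counter, defaultdict


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def rollout(start, actions):
    """Toy environment: each action in {-1, 0, 1} shifts the position."""
    frames = [start]
    for a in actions:
        frames.append(frames[-1] + a)
    return frames


def train_idm(labeled_episodes):
    """Toy inverse dynamics model (assumption: a lookup table, not a net).

    It is non-causal: it looks at the frame before AND the frame after
    a step, and records which action most often explains that change.
    """
    votes = defaultdict(Counter)
    for frames, actions in labeled_episodes:
        for t, a in enumerate(actions):
            votes[frames[t + 1] - frames[t]][a] += 1
    return {delta: c.most_common(1)[0][0] for delta, c in votes.items()}


def label_video(idm, frames):
    """Pseudo-label every step of a video that came without actions."""
    return [idm[frames[t + 1] - frames[t]] for t in range(len(frames) - 1)]


def clone_policy(pseudo_labeled):
    """Causal policy: sees only the current frame, never the future.

    Trained by majority vote (a stand-in for behavioral cloning) on the
    pseudo-labels; the feature is just the sign of the position.
    """
    votes = defaultdict(Counter)
    for frames, actions in pseudo_labeled:
        for t, a in enumerate(actions):
            votes[sign(frames[t])][a] += 1
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}


def run_policy(policy, start, steps):
    """Roll the cloned policy forward from a start position."""
    frames = [start]
    for _ in range(steps):
        frames.append(frames[-1] + policy[sign(frames[-1])])
    return frames


# Step 1: a small labeled set (frames plus the actions that produced them).
actions = [1, 1, 0, -1]
idm = train_idm([(rollout(0, actions), actions)])

# Step 2: action-free "videos" of an expert walking to position 0.
videos = [[3, 2, 1, 0, 0], [-2, -1, 0, 0]]
pseudo_labeled = [(v, label_video(idm, v)) for v in videos]

# Step 3: discard the IDM and train a causal policy on the pseudo-labels.
policy = clone_policy(pseudo_labeled)
trajectory = run_policy(policy, 4, 6)
```

In this toy setting the IDM is only needed at labeling time; the resulting policy acts from the current frame alone, which mirrors the "throw away the data labeler" step in the excerpt.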
