Train Agents on Atari

The goal of that setting can be stated as learning a traditional RL algorithm. Then the other axes of that setting is whether we're dealing with a single task or a multi-task setting. This isn't something that is super often discussed in, especially in some parts of the metarolature but the single-task case is still very interesting and it actually does. So you can train agents on Atari where you actually metarolerned the objective function for the policy gradient that's then updating your policy just on that single task.

Play episode from 11:37

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app