Policy Gradient Methods vs Value Based Methods

Policy gradient methods allow you to use continuous actions, where value-based methods require you to use discrete actions. The current reigning champion of the policy gradient approaches is the proximal policy optimization model, or PPO. And we'll compare a lot of these in the future. But right here I want to name-drop the most popular from each camp.
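
To make the continuous versus discrete distinction concrete, here is a minimal Python sketch. It is not PPO or any particular library's API. The function names, the Q-values, and the Gaussian mean and standard deviation are all made up for illustration. It only contrasts how an action gets picked in each camp: a value-based agent takes the argmax over a finite list of action values, while a policy gradient agent can sample a real-valued action from a parameterized distribution.

```python
import math
import random


def greedy_discrete_action(q_values):
    """Value-based selection: return the index of the largest Q-value.

    This needs a finite list of candidate actions to take the argmax over.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sample_continuous_action(mean, std, rng):
    """Policy-style selection: draw a real-valued action from a Gaussian.

    In a real agent, mean and std would come from a learned policy network.
    Here they are hypothetical fixed numbers.
    """
    return rng.gauss(mean, std)


def gaussian_log_prob(action, mean, std):
    """Log-density of an action under the Gaussian policy.

    Policy gradient updates are built from this kind of log-probability.
    """
    z = (action - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


rng = random.Random(0)

# Hypothetical Q-values for three discrete actions (e.g. left, stay, right).
q_values = [0.1, 0.7, 0.3]
discrete_action = greedy_discrete_action(q_values)

# Hypothetical policy output for one continuous action (e.g. a steering angle).
continuous_action = sample_continuous_action(0.5, 0.2, rng)
log_prob = gaussian_log_prob(continuous_action, 0.5, 0.2)

print("discrete action index:", discrete_action)
print("continuous action:", continuous_action, "log-prob:", log_prob)
```

The discrete case only ever returns one of a fixed set of indices, while the continuous case can return any real number, which is the practical difference between the two camps.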
