[07] John Schulman - Optimizing Expectations: From Deep RL to Stochastic Computation Graphs

The Thesis Review

Reinforcement Learning in Dota 2: The OpenAI Five Journey

This chapter explores the OpenAI Five project and its use of large-scale reinforcement learning and self-play to win at Dota 2. It covers the progression from one-on-one matches to full five-on-five team play, the effectiveness of the PPO (Proximal Policy Optimization) algorithm, and the challenge of aligning reward signals with the intended behavior in model-free approaches.
