Reinforcement Learning in Dota 2: The OpenAI Five Journey

This chapter explores the OpenAI Five project and its use of large-scale reinforcement learning and self-play to win at Dota 2. It traces the progression from one-on-one matches to full five-on-five team play, highlighting the effectiveness of the Proximal Policy Optimization (PPO) algorithm and the difficulty of designing reward signals that stay aligned with the intended behavior in model-free approaches.
