Train a Bot Using Human Data
The research drew on 50,000 games. The team tried to use as much self-play as possible while still taking advantage of the human data. The bot is penalized for choosing actions that are very unlikely under the human data set, which yields a policy that resembles, to some extent, how humans actually play the game.
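One common way to express this kind of penalty is KL regularization toward a policy learned from the human games. The summary above does not say which form the team used, so the sketch below is an assumption, not their method. The names `kl_regularized_policy`, `action_values`, `human_policy`, and `penalty_weight` are hypothetical, and the numbers are made up for illustration. The function returns the policy that maximizes expected value minus `penalty_weight` times the KL divergence from the human policy, which has the closed form pi(a) proportional to human_policy(a) * exp(Q(a) / penalty_weight).

```python
import math


def kl_regularized_policy(action_values, human_policy, penalty_weight):
    """Policy maximizing E_pi[Q] - penalty_weight * KL(pi || human_policy).

    Assumed form of the penalty (not confirmed by the source).
    Closed form: pi(a) is proportional to human_policy(a) * exp(Q(a) / penalty_weight).
    """
    if penalty_weight <= 0:
        raise ValueError("penalty_weight must be positive")
    if not any(p > 0 for p in human_policy):
        raise ValueError("human_policy must give some action nonzero probability")

    # Work in log space for numerical stability. Actions the human policy
    # never plays get -inf and therefore end up with probability zero.
    logits = [
        math.log(p) + q / penalty_weight if p > 0 else float("-inf")
        for q, p in zip(action_values, human_policy)
    ]
    max_logit = max(logits)
    weights = [math.exp(logit - max_logit) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative numbers only: action 0 has the highest estimated value
# but is rarely chosen by humans; action 1 is nearly as good and common.
action_values = [1.0, 0.9, 0.2]
human_policy = [0.01, 0.69, 0.30]

human_like = kl_regularized_policy(action_values, human_policy, penalty_weight=0.1)
near_greedy = kl_regularized_policy(action_values, human_policy, penalty_weight=0.01)
```

With the larger penalty weight the policy favors action 1, the strong move that humans also play often. With the smaller one it shifts to action 0, the highest-value move regardless of how rarely humans choose it. A very large penalty weight simply reproduces the human policy.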