
Episode 27: Noam Brown, FAIR, on achieving human-level performance in poker and Diplomacy, and the power of spending compute at inference time
Generally Intelligent
How to Modify a Value Function in a Poker Bot?
ReBeL's search algorithm can handle these high-dimensional, continuous state and action spaces. It uses a neural network value function that takes as input the belief distribution over which cards each player holds. This had been done before: DeepStack, a 2017 paper from the University of Alberta, first developed the technique.
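To make the idea of a value function over beliefs concrete, here is a minimal sketch in Python. It is not ReBeL's or DeepStack's actual code. The toy game size (`NUM_HANDS`), the single public feature, the `encode_belief_state` helper, and the tiny untrained `ValueNet` are all assumptions made for illustration. The only point is the input shape: the network sees a probability distribution over each player's possible private hands rather than one concrete hand.

```python
import numpy as np

# Hypothetical toy game: each player holds one of 6 possible private hands.
NUM_HANDS = 6


def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    if total <= 0:
        raise ValueError("belief weights must have positive total mass")
    return w / total


def encode_belief_state(beliefs, public_features):
    """Concatenate each player's belief over hands with public information."""
    parts = [normalize(b) for b in beliefs]
    parts.append(np.asarray(public_features, dtype=float))
    return np.concatenate(parts)


class ValueNet:
    """Tiny untrained MLP standing in for a learned value function (assumption)."""

    def __init__(self, input_dim, hidden_dim, output_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (input_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, 0.1, (hidden_dim, output_dim))
        self.b2 = np.zeros(output_dim)

    def __call__(self, x):
        hidden = np.maximum(0.0, x @ self.w1 + self.b1)  # ReLU layer
        return hidden @ self.w2 + self.b2


# Player 0 could hold any hand; player 1 is believed to favor hand 0
# and cannot hold hand 5.
beliefs = [np.ones(NUM_HANDS), [4, 1, 1, 1, 1, 0]]
public = [0.5]  # hypothetical public feature, e.g. a normalized pot size

x = encode_belief_state(beliefs, public)
net = ValueNet(input_dim=len(x), hidden_dim=16, output_dim=2 * NUM_HANDS)

# Assumed output layout: one value per (player, possible hand).
values = net(x).reshape(2, NUM_HANDS)
```

Because the weights are random, the numbers in `values` are meaningless here. In a real system the network would be trained so that these outputs approximate the expected value of each hand given the beliefs.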