
Episode 27: Noam Brown, FAIR, on achieving human-level performance in poker and Diplomacy, and the power of spending compute at inference time
Generally Intelligent
Are the Poker Bots Still Beating Experts?
The significance of search is something that I think has been underappreciated by the field. In the first paper I wrote on Hanabi, we applied search instead of reinforcement learning. We just took a handcrafted heuristic bot that was, like, the baseline that everybody would beat. And we added planning, a really, really simple form of search. It's actually the dumbest possible search you can do. That got to superhuman performance in self-play Hanabi.
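To make the idea of "a heuristic bot plus a very simple search" concrete, here is a minimal Python sketch. It is not the method from the Hanabi paper: the toy graph game, the `GRAPH` table, and the function names `heuristic_policy`, `rollout`, `search_policy`, and `play` are all hypothetical, chosen only for illustration. The sketch shows one-step lookahead: for each available action, take it, assume the baseline heuristic plays out the rest of the game, and pick the action with the best total.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy game: a small directed graph.
# Each edge is (next_node, reward); a node with no edges is terminal.
GRAPH: Dict[str, List[Tuple[str, int]]] = {
    "start": [("a", 3), ("b", 1)],
    "a": [("end", 0)],
    "b": [("c", 2), ("end", 1)],
    "c": [("end", 10)],
    "end": [],
}


def heuristic_policy(node: str) -> int:
    """Baseline bot: choose the edge with the largest immediate reward."""
    edges = GRAPH[node]
    return max(range(len(edges)), key=lambda i: edges[i][1])


def rollout(node: str) -> int:
    """Follow the baseline bot from `node` to the end; return the total reward."""
    total = 0
    while GRAPH[node]:
        node, reward = GRAPH[node][heuristic_policy(node)]
        total += reward
    return total


def search_policy(node: str) -> int:
    """One-step lookahead: score each action by its immediate reward
    plus the reward the baseline bot would collect afterwards."""
    edges = GRAPH[node]
    return max(range(len(edges)), key=lambda i: edges[i][1] + rollout(edges[i][0]))


def play(policy: Callable[[str], int], node: str = "start") -> int:
    """Play a full game with `policy` and return the total reward."""
    total = 0
    while GRAPH[node]:
        node, reward = GRAPH[node][policy(node)]
        total += reward
    return total


if __name__ == "__main__":
    print("heuristic only:", play(heuristic_policy))
    print("heuristic + search:", play(search_policy))
```

In this toy game the greedy baseline grabs the larger first reward and misses the better path, while the lookahead reuses the same baseline as its rollout policy and finds it. A game with hidden information, such as Hanabi, would additionally need to average rollouts over sampled possibilities for the unseen cards, which this sketch leaves out.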