Episode 27: Noam Brown, FAIR, on achieving human-level performance in poker and Diplomacy, and the power of spending compute at inference time

Generally Intelligent

How Large Is the Action Space for the Policy?

If you ignore the language, the action space is still huge. The branching factor is something like 10 to the 20th, and that's without the language. So we sample from the policy net a set of highly likely actions for each player. We iterate 256 times, and we end up with, hopefully and in practice, a better prediction of what everybody's going to do than we'd get from the imitation-learned policy alone.
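
The excerpt does not name the iterative procedure, so the following is only a rough sketch of the general idea, assuming regret matching as the update rule and a toy two-player game in place of Diplomacy's seven players. The payoff table, the candidate action sets, and the function name `refine_policies` are all hypothetical. In the setting described above, the candidates would be the actions sampled from the policy net.

```python
def refine_policies(payoffs, iterations=256):
    """Regret matching over a small set of candidate actions (toy sketch).

    payoffs[p][a0][a1] is player p's payoff when player 0 plays candidate a0
    and player 1 plays candidate a1. Returns each player's policy averaged
    over all iterations, which serves as the refined prediction.
    """
    n_actions = [len(payoffs[0]), len(payoffs[0][0])]
    regrets = [[0.0] * n for n in n_actions]
    policy_sums = [[0.0] * n for n in n_actions]

    def current_policy(p):
        # Play in proportion to positive regret; uniform if there is none.
        positive = [max(r, 0.0) for r in regrets[p]]
        total = sum(positive)
        if total <= 0.0:
            return [1.0 / n_actions[p]] * n_actions[p]
        return [x / total for x in positive]

    for _ in range(iterations):
        policies = [current_policy(p) for p in range(2)]
        for p in range(2):
            for a, prob in enumerate(policies[p]):
                policy_sums[p][a] += prob

        for p in range(2):
            opponent = policies[1 - p]
            # Expected value of each candidate against the opponent's policy.
            values = []
            for a in range(n_actions[p]):
                if p == 0:
                    v = sum(opponent[b] * payoffs[0][a][b]
                            for b in range(n_actions[1]))
                else:
                    v = sum(opponent[b] * payoffs[1][b][a]
                            for b in range(n_actions[0]))
                values.append(v)
            expected = sum(policies[p][a] * values[a]
                           for a in range(n_actions[p]))
            # Accumulate how much better each candidate would have done.
            for a in range(n_actions[p]):
                regrets[p][a] += values[a] - expected

    return [[s / iterations for s in policy_sums[p]] for p in range(2)]


# Hypothetical zero-sum payoffs: player 0 has three candidates (the third is
# always bad for them), player 1 has two.
P0 = [[1.0, -1.0], [-1.0, 1.0], [-2.0, -2.0]]
P1 = [[-x for x in row] for row in P0]
predicted = refine_policies([P0, P1], iterations=256)
```

In this toy game the averaged policy puts almost no weight on player 0's third candidate, which is the kind of correction the search step is meant to make to an initial policy.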
