

Episode 27: Noam Brown, FAIR, on achieving human-level performance in poker and Diplomacy, and the power of spending compute at inference time
Feb 9, 2023
Chapters
Introduction
00:00 • 4min
How Did You Become a Researcher?
03:47 • 2min
The Future of Artificial Intelligence
05:26 • 2min
Is There a Nash Equilibrium in Poker?
07:45 • 2min
How Much of an Improvement Would You Get if You Add Search to the Poker Bot?
10:12 • 3min
Scaling Up Inference
13:34 • 3min
Is Chain of Thought a Good Idea?
16:12 • 2min
Search in Imperfect Information Games
17:53 • 3min
Do You Have a Balanced Sound Search Algorithm?
20:54 • 2min
Can You Relax Constraints Too Much?
22:30 • 2min
After You Won Poker, What Was the Reception Like?
24:16 • 2min
Poker
26:39 • 2min
Are the Poker Bots Still Beating Experts?
28:42 • 2min
Why Don't People Appreciate Search as Much?
31:06 • 2min
How to Play Perfect Information Games Like Backgammon
33:04 • 2min
The Generalization From AlphaZero
35:00 • 5min
Counterfactual Regret Minimization
39:44 • 2min
Six-Player Poker
41:23 • 2min
Is There a Nash Equilibrium?
43:09 • 2min
Is Collaboration a Problem in Two-Player Poker?
44:57 • 2min
How to Modify a Value Function in a Poker Bot?
46:31 • 2min
The Hardest Game to Make an AI for - Diplomacy
48:33 • 2min
How Do You Approach Diplomacy?
51:02 • 2min
The Key Insights for Doing Well on No Press Diplomacy
52:42 • 2min
No Press Diplomacy - Can You Model Humans Better?
54:23 • 5min
Is Your Model Modeling the Human Ability to Plan Getting Better Performance?
59:20 • 3min
No Press Diplomacy
01:02:28 • 3min
How Did You Progress From No Press to Full Press?
01:05:57 • 2min
The Dialog Model Is Used to Predict What All the Players Are Going to Do
01:08:11 • 4min
The Power of Large Language Models in the Planning Process
01:11:58 • 2min
What's Going on in a Planning Process?
01:13:29 • 2min
How Large Is the Action Space for the Policy?
01:15:29 • 2min
MCTS
01:17:18 • 4min
The Challenges in Grounding the Language Model in Intents and Plans
01:21:01 • 3min
The Problem of Not Controlling the Dialogue Model
01:23:32 • 2min
What Are the Biggest Breakthroughs in Cicero?
01:25:20 • 2min
What Are Your Instincts on How to Make It More General?
01:26:58 • 3min
Is There a Next Domain?
01:29:31 • 2min
Is There Anything You've Been Excited About?
01:31:28 • 2min
How Long Will It Be Before AI Can Write a Full-Length Prize-Winning Novel?
01:33:24 • 2min
Do You Have Any Controversial Opinions in the Multi-Agent RL Community?
01:35:28 • 2min
What Makes a Great Researcher?
01:37:28 • 4min
Scaling Training Costs - Is That Going to Happen in the Next Few Years?
01:41:37 • 3min