Spending extra computation at inference time can yield large gains in AI performance, as techniques applied to poker and other games have shown. Capturing those gains requires a shift from simply scaling up models toward improving the search algorithms that run at inference time, unlocking more of an AI system's potential.
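To make the idea concrete, here is a minimal Python sketch (a toy, not anything from the episode; the action names and payoff numbers are invented) where the only tunable quantity is inference-time compute: the number of Monte Carlo rollouts used to estimate each action's value.

```python
import random

# Toy illustration: estimate each action's value by noisy Monte Carlo
# rollouts. The only knob is `num_rollouts` -- pure inference-time compute.
# More rollouts -> lower-variance estimates -> better decisions, with no
# change to the underlying "model" (here, the noisy payoff simulator).

TRUE_VALUES = {"fold": 0.0, "call": 0.10, "raise": 0.12}  # hypothetical payoffs

def rollout(action: str) -> float:
    """Simulate one noisy playout of taking `action`."""
    return TRUE_VALUES[action] + random.gauss(0.0, 1.0)

def choose_action(num_rollouts: int) -> str:
    estimates = {
        a: sum(rollout(a) for _ in range(num_rollouts)) / num_rollouts
        for a in TRUE_VALUES
    }
    return max(estimates, key=estimates.get)

for budget in (1, 10, 100, 10_000):
    picks = [choose_action(budget) for _ in range(200)]
    accuracy = picks.count("raise") / len(picks)
    print(f"rollouts={budget:>6}  picked best action {accuracy:.0%} of the time")
```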
Noam Brown's journey into AI research began with a high-school project that sparked an interest in programming and AI. After a detour through finance and economics research, he eventually pursued AI research as a passion, and his PhD work in algorithmic game theory shifted his focus away from restructuring financial markets and toward artificial intelligence.
The development of poker bots highlighted the importance of search algorithms: emphasizing search at inference time, rather than only scaling up models, dramatically improved performance. Incorporating search into training as well enabled poker bots to score decisive victories against expert human players, demonstrating the effectiveness of this approach.
The ReBeL algorithm extended the AlphaZero paradigm to imperfect-information games like poker, demonstrating that these techniques can transfer across game contexts. By accommodating the complexities of hidden information while playing in an AlphaZero-like style, ReBeL offered a more streamlined and general approach to game-playing AI.
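A toy illustration of the public-belief-state idea at the heart of this extension (the hand categories and the assumed opponent policy below are invented for illustration, not ReBeL's): the opponent's hidden cards are replaced by a probability distribution that is updated by Bayes' rule as public actions are observed.

```python
import numpy as np

# Replace the hidden hand with a belief distribution over it, updated from
# the opponent's observed public action. ReBeL runs AlphaZero-style
# self-play and search on top of belief states like this one.

HANDS = ["weak", "medium", "strong"]
prior = np.array([1/3, 1/3, 1/3])             # uniform over the deal
p_bet_given_hand = np.array([0.1, 0.5, 0.9])  # assumed opponent policy

# Observing a bet shifts the belief toward strong hands (Bayes' rule).
posterior = prior * p_bet_given_hand
posterior /= posterior.sum()
print(dict(zip(HANDS, posterior.round(3))))   # "strong" becomes most likely
```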
Modifying the Counterfactual Regret Minimization (CFR) algorithm was crucial for handling poker's high-dimensional, continuous state and action spaces. The challenge involved maintaining dual representations of the game and adapting the algorithm so that the value function operates on the belief distribution over each player's cards. A neural-network value function whose inputs are the probabilities of different hands improved the search algorithm's performance.
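The core update inside CFR is regret matching at each information set. The sketch below shows that update in isolation on rock-paper-scissors rather than poker; full CFR traverses the game tree and weights these updates by counterfactual reach probabilities, which this toy omits.

```python
import numpy as np

PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]])  # row player's payoff for R, P, S

def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(3, 1 / 3)

cum_regret = np.zeros((2, 3))    # cumulative regrets for both players
cum_strategy = np.zeros((2, 3))  # strategy sums; the average converges to Nash

for _ in range(10_000):
    s0 = regret_matching(cum_regret[0])
    s1 = regret_matching(cum_regret[1])
    cum_strategy[0] += s0
    cum_strategy[1] += s1
    # Instantaneous regret: payoff of each pure action against the
    # opponent's mix, minus the value of the current mixed strategy.
    u0 = PAYOFF @ s1
    u1 = -PAYOFF.T @ s0
    cum_regret[0] += u0 - s0 @ u0
    cum_regret[1] += u1 - s1 @ u1

# Average strategies approach the Nash equilibrium (~1/3 each).
print(cum_strategy / cum_strategy.sum(axis=1, keepdims=True))
```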
The episode highlighted a significant advance in training value functions: using self-play rather than the random state sampling of earlier methods like DeepStack. Self-play focuses the neural network on the game states that actually arise in play, making value-function training more effective, and the resulting algorithm scaled and performed better than random-sampling approaches.
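A minimal, self-contained sketch of the contrast (a toy random-walk game, not FAIR's training code): the regression targets for the value function come from trajectories the current policy actually visits, rather than from states sampled at random, so model capacity is spent where search will later query it.

```python
import random

def step(pos, action):
    return pos + action  # toy game: a walk on the integer line

def play_episode(policy, horizon=10):
    """Play one self-play episode; return the visited states and the outcome."""
    states, pos = [], 0
    for _ in range(horizon):
        states.append(pos)
        pos = step(pos, policy(pos))
    return states, (1.0 if pos > 0 else -1.0)  # terminal payoff

def policy(pos):
    return random.choice([-1, +1])  # placeholder; would be improved by search

# Collect (state, outcome) regression targets from self-play trajectories.
dataset = []
for _ in range(1000):
    visited, outcome = play_episode(policy)
    dataset.extend((s, outcome) for s in visited)

# Tabular stand-in for the value network: mean outcome from each visited state.
totals, counts = {}, {}
for s, z in dataset:
    totals[s] = totals.get(s, 0.0) + z
    counts[s] = counts.get(s, 0) + 1
value_fn = {s: totals[s] / counts[s] for s in totals}
print(sorted(value_fn.items())[:5])
```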
The transition from two-player poker bots to Diplomacy meant tackling a seven-player game that mixes cooperation, competition, and private communication. Diplomacy was chosen as a long-term research effort precisely because of this intricacy, which demanded new approaches to modeling human behavior and strategic decision-making. The episode also sheds light on the complexities and ethical considerations of developing AI agents for such nuanced, adversarial settings.
The episode discusses the difficulty of controlling dialogue models when they interact with third parties: the agent has limited control over communication style and strategy, and must rely on the dialogue model to generate reasonable messages. The dialogue model, although capable, cannot recognize sensitive information and sometimes discloses unintended details, motivating a filtering mechanism on top of its outputs.
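One plausible shape for such a mechanism (a hypothetical sketch; CICERO's actual filter stack is more involved than a pattern check) is a post-hoc filter over sampled candidate messages: generate several, drop any that leak information the agent intended to keep private, and fall back to silence if nothing survives.

```python
import re
from typing import Optional

# Hypothetical patterns standing in for a learned sensitivity classifier.
SENSITIVE_PATTERNS = [
    r"\bI will attack\b",   # leaks the planned move
    r"\bmy real plan\b",
]

def is_safe(message: str) -> bool:
    return not any(re.search(p, message, re.IGNORECASE)
                   for p in SENSITIVE_PATTERNS)

def filtered_reply(candidates: list[str]) -> Optional[str]:
    """Return the first candidate that passes every filter, else stay silent."""
    for msg in candidates:
        if is_safe(msg):
            return msg
    return None  # better to send nothing than to leak intent

candidates = [
    "Honestly, my real plan is to take Munich this turn.",
    "Let's support each other into Burgundy this year.",
]
print(filtered_reply(candidates))  # the leaking candidate is filtered out
```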
The episode also covers advances in scaling language models and their outsized impact on performance. It raises the question of whether scaling hits a ceiling once models reach a billion parameters, while anticipating continued progress over the next few years, and weighs investing in ever-larger training runs against exploring inference-time computation as a direction for future research.
Noam Brown is a research scientist at FAIR. During his PhD at CMU, he built the first AI to defeat top humans in No Limit Texas Hold 'Em poker. More recently, he was part of the team that built CICERO, which achieved human-level performance in Diplomacy. In this episode, we extensively discuss the ideas underlying both projects, the power of spending compute at inference time, and much more.