Chapters
Introduction
00:00 • 3min
Is There a Future for Debate?
03:06 • 2min
The Human Interaction in Debate
04:58 • 3min
Is There Anything I Should Ask About Debate?
08:12 • 2min
The Importance of Language Models for Safety Alignment
10:22 • 2min
Language and Human Preferences
12:01 • 3min
Is There a Solution to ELK Without a Solution to Scalable Alignment?
15:15 • 2min
Scaling Scalability to Generate Explanations
17:08 • 2min
Using a Language Model to Detect a Deficiency Problem
19:33 • 2min
Is There a Better Detector for Finding Failures?
21:37 • 2min
Is the Second Approximation a Good Idea?
23:34 • 3min
How to Find Failures in a Language Model?
26:16 • 3min
Gopher Language Models
29:10 • 2min
How to Generate a Red Teaming Model
31:04 • 2min
How to Fine-Tune an RL Code Base for Language Models
33:32 • 2min
Teaching Language Models to Support Answers With Verified Quotes
35:24 • 2min
Language Model Interpretability
37:15 • 2min
Is There a Language Model in Isolation?
39:06 • 2min
Is There a Reward Model for the Answers?
40:51 • 2min
I Don't Have the Number for You on Hand.
42:22 • 2min
Do You Have Any Lessons About Human Learning?
44:21 • 2min
How Much Time Does It Take to Write an RPG?
46:19 • 2min
Uncertainty Estimation for Language Reward Models
47:58 • 2min
How Hard Is Uncertainty Estimation?
50:24 • 2min
How Much Can You Leave Fixed?
52:18 • 4min
How Can That Be Right?
56:30 • 3min
Disentangling Aleatoric and Epistemic Uncertainty
59:48 • 2min
Recruiting for Machine Learning and Cognitive Science
01:01:22 • 3min