24 - Superalignment with Jan Leike

1

Introduction

00:00 • 2min

2

The Importance of Human-Level Automated Alignment Research

02:09 • 3min

3

How Important Is the Human-Level Qualifier in Alignment Research?

05:31 • 4min

4

How to Scale Up an Automated Alignment Research Model

09:21 • 3min

5

How to Make an Automated Alignment Researcher

12:22 • 2min

6

Automating 99.9% of Alignment Research

14:19 • 2min

7

The Alignment Tax

16:14 • 2min

8

Scalable Oversight for AI Alignment Research

18:44 • 2min

9

How to Align Superintelligence

20:40 • 2min

10

The Role of Humans in AI Alignment

22:58 • 3min

11

How to Make a Smart AI Alignment Researcher

26:16 • 4min

12

How to Do Scalable Oversight With Alignment Research

30:10 • 2min

13

The Importance of Using a Criticism Model in AI Alignment Research

32:36 • 2min

14

The Discriminator Criticism App

34:53 • 5min

15

Scale Below Recid - How to Measure Scale Below Recid

39:26 • 2min

16

The Problems With Automated Task Evaluation

41:17 • 2min

17

The Future of Interpretability

43:09 • 2min

18

The Importance of Interpretability in Language Models

45:26 • 4min

19

Automated Interpretability for Neurons

49:56 • 4min

20

The Importance of Scaling Interpretability

53:46 • 2min

21

The Risks of Misalignment of AI Systems

55:19 • 2min

22

How to Train Misaligned Models to Be Consistent Liars

57:42 • 2min

23

How to Train a System to Succeed

01:00:02 • 3min

24

The Core Technical Challenges of Superintelligence Alignment

01:02:36 • 2min

25

The Four Years of AI Progress

01:04:29 • 3min

26

The Alignment Problem

01:07:06 • 3min

27

The Importance of Good Measures in Audits of AI Systems

01:10:16 • 3min

28

How the OpenAI Team Is Relating to the Alignment Team

01:13:44 • 3min

29

How the Superalignment Team Is Relating to Other Things at OpenAI, Like Efforts to Make ChatGPT Nicer Minimize on Our Sources

01:16:45 • 2min

30

The Importance of Collaboration in AI Research

01:18:18 • 2min

31

The Advantages of Automated Alignment Research

01:20:46 • 4min

32

OpenAI's Plan for AI Alignment

01:24:22 • 2min

33

How to Scale a New Superalignment Team

01:26:12 • 2min

34

Generalization and Scalable Oversight

01:28:10 • 3min

35

The Importance of Generalization in Neural Networks

01:31:08 • 4min

36

The Interaction of Cross Validation and Interpretation

01:34:40 • 2min

37

The Importance of Cross Validation Techniques

01:36:31 • 2min

38

Neural Networks Generalize Across Languages

01:38:23 • 4min

39

How to Summon the Complexity Theoretic Definition of Love and Goodness Within the Superalignment Team

01:42:03 • 2min

40

How to Make AI Systems More Aligned

01:43:45 • 5min

41

The Future of Alignment

01:48:30 • 5min

42

The Importance of Scalable Oversight

01:53:45 • 5min

43

The Importance of Language Models for Alignment

01:58:32 • 3min

44

How to Improve Pre-Training Loss

02:01:33 • 1min

45

The Benefits of Language Models

02:02:56 • 2min

46

How to Align Superintelligence in Four Years

02:05:05 • 3min