AXRP - the AI X-risk Research Podcast

24 - Superalignment with Jan Leike

Jul 27, 2023
Chapters
1. Introduction (00:00 • 2min)
2. The Importance of Human-Level Automated Alignment Research (02:09 • 3min)
3. How Important Is the Human-Level Qualifier in Alignment Research? (05:31 • 4min)
4. How to Scale Up an Automated Alignment Research Model (09:21 • 3min)
5. How to Make an Automated Alignment Researcher (12:22 • 2min)
6. Automating 99.9% of Alignment Research (14:19 • 2min)
7. The Alignment Tax (16:14 • 2min)
8. Scalable Oversight for AI Alignment Research (18:44 • 2min)
9. How to Align Superintelligence (20:40 • 2min)
10. The Role of Humans in AI Alignment (22:58 • 3min)
11. How to Make a Smart AI Alignment Researcher (26:16 • 4min)
12. How to Do Scalable Oversight With Alignment Research (30:10 • 2min)
13. The Importance of Using a Critique Model in AI Alignment Research (32:36 • 2min)
14. The Discriminator-Critique Gap (34:53 • 5min)
15. How to Measure Scalable Oversight (39:26 • 2min)
16. The Problems With Automated Task Evaluation (41:17 • 2min)
17. The Future of Interpretability (43:09 • 2min)
18. The Importance of Interpretability in Language Models (45:26 • 4min)
19. Automated Interpretability for Neurons (49:56 • 4min)
20. The Importance of Scaling Interpretability (53:46 • 2min)
21. The Risks of Misaligned AI Systems (55:19 • 2min)
22. How to Train Misaligned Models to Be Consistent Liars (57:42 • 2min)
23. How to Train a System to Succeed (01:00:02 • 3min)
24. The Core Technical Challenges of Superintelligence Alignment (01:02:36 • 2min)
25. Four Years of AI Progress (01:04:29 • 3min)
26. The Alignment Problem (01:07:06 • 3min)
27. The Importance of Good Measures in Audits of AI Systems (01:10:16 • 3min)
28. How the OpenAI Team Relates to the Alignment Team (01:13:44 • 3min)
29. How the Superalignment Team Relates to Other Efforts at OpenAI, Like Making ChatGPT Nicer (01:16:45 • 2min)
30. The Importance of Collaboration in AI Research (01:18:18 • 2min)
31. The Advantages of Automated Alignment Research (01:20:46 • 4min)
32. OpenAI's Plan for AI Alignment (01:24:22 • 2min)
33. How to Scale a New Superalignment Team (01:26:12 • 2min)
34. Generalization and Scalable Oversight (01:28:10 • 3min)
35. The Importance of Generalization in Neural Networks (01:31:08 • 4min)
36. The Interaction of Cross-Validation and Interpretability (01:34:40 • 2min)
37. The Importance of Cross-Validation Techniques (01:36:31 • 2min)
38. Neural Networks Generalize Across Languages (01:38:23 • 4min)
39. How to Summon the Complexity-Theoretic Definition of Love and Goodness Within the Superalignment Team (01:42:03 • 2min)
40. How to Make AI Systems More Aligned (01:43:45 • 5min)
41. The Future of Alignment (01:48:30 • 5min)
42. The Importance of Scalable Oversight (01:53:45 • 5min)
43. The Importance of Language Models for Alignment (01:58:32 • 3min)
44. How to Improve Pre-Training Loss (02:01:33 • 1min)
45. The Benefits of Language Models (02:02:56 • 2min)
46. How to Align Superintelligence in Four Years (02:05:05 • 3min)