AXRP - the AI X-risk Research Podcast

17 - Training for Very High Reliability with Daniel Ziegler

Aug 21, 2022
Chapters
1. Introduction (00:00 • 2min)
2. Is AGI Alignment Hard? (02:00 • 2min)
3. Scalable Oversight - Is Scalability a Good Idea? (04:20 • 2min)
4. Scalable Oversight (06:03 • 2min)
5. The Contribution of Adversarial Training (07:59 • 2min)
6. Are We Losing Performance Competitiveness by Avoiding Catastrophic Failures? (09:58 • 2min)
7. Is the Catastrophe Measure Better Than the Quality Measure? (11:45 • 3min)
8. How Do You Think About These Two Metrics? (14:20 • 3min)
9. Is There a Metric for Capturing a Failure? (17:29 • 2min)
10. Is the Generator Trying to Make Things Any Worse? (19:26 • 2min)
11. How Do We Get More Out of Our Time? (21:17 • 3min)
12. Is It a Viable Route for an NLP Attack? (24:14 • 2min)
13. Is There a Gradient Barrier to Learning? (26:20 • 2min)
14. Is There a Problem With Token Substitution? (28:21 • 4min)
15. Is There Anything You Didn't Try That Would Have Worked? (32:27 • 2min)
16. What Does Quality Mean? (34:31 • 2min)
17. Do You Think It Makes a Difference? (36:33 • 2min)
18. Is It a Fanfic or Something? (38:18 • 2min)
19. The Effects of a Random Mistake on the Quality of the Language Model (39:54 • 2min)
20. Is There Violence in Alice in Wonderland? (41:25 • 3min)
21. Rejection-Sampled Snippets - Is This the Correct Estimator? (44:09 • 5min)
22. What Is Redwood Research? (49:29 • 2min)
23. Are You Less Excited About Deconfusion Research? (51:41 • 2min)
24. Is There Any Research on Scalable Oversight? (53:54 • 2min)
25. Is There a Relationship Between the Interpretability Team and the Adversarial Training Team? (56:20 • 2min)
26. Career Coaching at 80,000 Hours (58:18 • 3min)