The Inside View

3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability

Jun 8, 2021
Chapters
1
Introduction
00:00 • 2min
2
Cocoanuts
01:56 • 2min
3
Compiling Python
04:06 • 2min
4
I Think That's a Good Thing.
05:42 • 2min
5
How Fast Do We Build Really Powerful AI Systems?
07:24 • 5min
6
How Much Does It Cost?
12:28 • 2min
7
The Data Collection Procedure
14:35 • 2min
8
Is It Hard to Keep Your Basic Method Secret?
16:31 • 2min
9
How to Keep Your Weights Secret
18:13 • 3min
10
Using an AI That I'm Not Sure Is Aligned or Not
21:02 • 3min
11
Intent Alignment
23:33 • 2min
12
Is Human Coordination Really a Problem?
25:36 • 3min
13
The Human Is Like Drinking Alcohol at Tensopa
28:15 • 4min
14
Optimization Power and Quantilizers
31:46 • 2min
15
How to Apply More Optimization Power to a Task?
34:07 • 3min
16
How to Hack the Human Brain?
36:41 • 2min
17
Is There a Strategy for Tomorrow?
38:31 • 2min
18
Defining Multiple Types of Intelligence
40:29 • 2min
19
What Is an Optimizer?
42:12 • 4min
20
Are We Really Proxies for What Evolution Cares About?
45:50 • 2min
21
Evolution Isn't Just Selecting for Life
47:38 • 2min
22
Why Would Evolution Rather Have Each Individual Be More Altruistic?
49:42 • 2min
23
Evolution and Altruism in Machine Learning
51:47 • 2min
24
Mesa-Optimization and Mazes
54:04 • 3min
25
Is Inner Alignment Good at Finding Green Arrows?
56:38 • 3min
26
After It Memorizes, It's Not Compute-Optimal
01:00:05 • 2min
27
Is There a Sesation?
01:01:53 • 3min
28
Is It in the Right Side for You?
01:04:23 • 2min
29
Capability Robustness
01:05:59 • 2min
30
Robustness
01:07:39 • 5min
31
Explaining Magic
01:12:36 • 2min
32
I Think It's Reasonably Accurate
01:14:44 • 2min
33
Is It Worth the Gains in Safety?
01:17:01 • 2min
34
Transparency Training
01:18:39 • 3min
35
Amplified Overseer and Interpretability Space
01:22:03 • 2min
36
Visualizing Features
01:23:37 • 1min
37
Transparency Over Amplification
01:25:06 • 2min
38
Is There Any Empirical Environment or Research That Can Tell Us if Something Works?
01:26:48 • 3min
39
Imitative Amplification
01:29:20 • 2min
40
Is It Like Humans Consulting Humans?
01:31:07 • 4min
41
Recursive Reward Modeling - What's the Difference?
01:34:50 • 2min
42
Is It Human Feedback or a Reward Model?
01:36:49 • 3min
43
I'm Preparing to Crystalas Work on Ceays, Right?
01:39:23 • 2min
44
DALL-E or Not?
01:41:03 • 3min