

3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability
Jun 8, 2021
Chapters
Introduction
00:00 • 2min
Coconut
01:56 • 2min
Compiling Python
04:06 • 2min
I Think That's a Good Thing.
05:42 • 2min
How Fast Do We Build Really Powerful AI Systems?
07:24 • 5min
How Much Does It Cost?
12:28 • 2min
Gata Data Collection Procedure
14:35 • 2min
Is It Hard to Keep Your Basic Method Secret?
16:31 • 2min
How to Keep Your Weights Secret
18:13 • 3min
Using AI: I'm a Layman, and I'm Not Sure if It's Aligned or Not
21:02 • 3min
Intent Alignment
23:33 • 2min
Is Human Coordination Really a Problem?
25:36 • 3min
The Human Is Like Drinking Alcohol at Tensopa
28:15 • 4min
Optimization Power and Quantilizers
31:46 • 2min
How to Apply More Optimization Power to a Task?
34:07 • 3min
How to Hack the Human Brain?
36:41 • 2min
Is There a Strategy for Tomorrow?
38:31 • 2min
Defining Multiple Times Intelligence
40:29 • 2min
What Is an Optimizer?
42:12 • 4min
Are We Really Proxies for What Evolution Cares About?
45:50 • 2min
Evolution Isn't Just Selecting for Life
47:38 • 2min
Why Would Evolution Rather Have Each Individual Be More Altruistic?
49:42 • 2min
Evolution, Evolution, and Altruism in Machine Learning
51:47 • 2min
Mesa-Optimization and Mazes
54:04 • 3min
Is Inner Alignment Good at Finding Green Arrows?
56:38 • 3min
A, Ter I Memorize, and It's Not Computed Oftima
01:00:05 • 2min
Is There a Sesation?
01:01:53 • 3min
Is It in the Right Side for You?
01:04:23 • 2min
Capability Robustness
01:05:59 • 2min
Robustness
01:07:39 • 5min
Xplaying Magic
01:12:36 • 2min
I Think It's Resnly Accurate, I Think It Was Aly Accurate.
01:14:44 • 2min
Is It Worth the Gains Safety?
01:17:01 • 2min
Transparency Training
01:18:39 • 3min
Amplified Overseer and Interpretability Space
01:22:03 • 2min
Iithink, U Visualized Features Ind Coing Ron Anda
01:23:37 • 1min
Transparency Over Amplification
01:25:06 • 2min
Is There Any Empirical Environment or Research That Can Tell Us if Something Works?
01:26:48 • 3min
Imitative Amplification
01:29:20 • 2min
Is It Like Humans Consulting Humans?
01:31:07 • 4min
Recursive Reward Modeling - What's the Difference?
01:34:50 • 2min
Human Feedback: Is It Human Feedback, or a Reward Model?
01:36:49 • 3min
I'm Preparing to Crystalas Work on Ceays, Right?
01:39:23 • 2min
Dolly or Not?
01:41:03 • 3min