3. Evan Hubinger on Takeoff speeds, Risks from learned optimization & Interpretability

1

Introduction

00:00 • 2min

2

Coconut

01:56 • 2min

3

Compiling Python

04:06 • 2min

4

I Think That's a Good Thing.

05:42 • 2min

5

How Fast Do We Build Really Powerful AI Systems?

07:24 • 5min

6

How Much Does It Cost?

12:28 • 2min

7

Data Collection Procedure

14:35 • 2min

8

Is It Hard to Keep Your Basic Method Secret?

16:31 • 2min

9

How to Keep Your Weights Secret

18:13 • 3min

10

Using AI: I'm a Layman, and I'm Not Sure if It's Aligned or Not.

21:02 • 3min

11

Intent Alignment

23:33 • 2min

12

Is Human Coordination Really a Problem?

25:36 • 3min

13

The Human Is Like Drinking Alcohol at Tensopa

28:15 • 4min

14

Optimization Power and Quantilizers

31:46 • 2min

15

How to Apply More Optimization Power to a Task?

34:07 • 3min

16

How to Hack the Human Brain?

36:41 • 2min

17

Is There a Strategy for Tomorrow?

38:31 • 2min

18

Defining Multiple Times Intelligence

40:29 • 2min

19

What Is an Optimizer?

42:12 • 4min

20

Are We Really Proxies for What Evolution Cares About?

45:50 • 2min

21

Evolution Isn't Just Selecting for Life

47:38 • 2min

22

Why Would Evolution Rather Have Each Individual Be More Altruistic?

49:42 • 2min

23

Evolution, Evolution, and Altruism in Machine Learning

51:47 • 2min

24

Mesa-Optimization and Mazes

54:04 • 3min

25

Is Inner Alignment Good at Finding Green Arrows?

56:38 • 3min

26

A, Ter I Memorize, and It's Not Computed Oftima

01:00:05 • 2min

27

Is There a Sesation?

01:01:53 • 3min

28

Is It in the Right Side for You?

01:04:23 • 2min

29

Capability Robustness

01:05:59 • 2min

30

Robustness

01:07:39 • 5min

31

Xplaying Magic

01:12:36 • 2min

32

I Think It's Resnly Accurate, I Think It Was Aly Accurate.

01:14:44 • 2min

33

Is It Worth the Gains in Safety?

01:17:01 • 2min

34

Transparency Training

01:18:39 • 3min

35

Amplified Overseer and Interpretability Space

01:22:03 • 2min

36

Iithink, U Visualized Features Ind Coing Ron Anda

01:23:37 • 1min

37

Transparency Over Amplification

01:25:06 • 2min

38

Is There Any Empirical Environment or Research That Can Tell Us if Something Works?

01:26:48 • 3min

39

Imitative Amplification

01:29:20 • 2min

40

Is It Like Humans Consulting Humans?

01:31:07 • 4min

41

Recursive Reward Modeling - What's the Difference?

01:34:50 • 2min

42

Human Feedback: Is It Human Feedback or a Reward Model?

01:36:49 • 3min

43

I'm Preparing to Crystalas Work on Ceays, Right?

01:39:23 • 2min

44

Dolly or Not?

01:41:03 • 3min