AXRP - the AI X-risk Research Podcast

7 - Side Effects with Victoria Krakovna

May 14, 2021
Chapters
1
Introduction
00:00 • 3min
2
Is It Possible to Have an Impact Measure Like This?
03:21 • 4min
3
How to Prevent Side Effects Using Relative Reachability
07:06 • 2min
4
Is the Inaction Counterfactual a Counter Factor?
09:33 • 2min
5
Is There a Reversibility Based Impact Measure?
11:25 • 2min
6
Impact Measures to Preserve Reachability of States
13:18 • 4min
7
Impact Measures
17:38 • 5min
8
Is There a Difference Between Relative Reachability and AU?
22:39 • 2min
9
Can You Give Us a Summary of Which Formulations Actually Pass the Most Test Cases?
24:24 • 2min
10
Reversibility Penalty
26:17 • 6min
11
Is the Reward Function Job, Instead of Job?
31:50 • 2min
12
Specifying the Positive Effects of Achieving a Task
34:04 • 3min
13
The Nuclear Power Plant Example and the Next Paper
37:29 • 3min
14
Why Do Side Effects Matter?
39:59 • 5min
15
Do You Think That the Baseline Is a Problem in the Stochastic Case?
44:44 • 2min
16
How to Deploy a Reference Agent on a Future Task
46:21 • 2min
17
The Impact Measurement of Future Tasks
48:36 • 4min
18
Is There a Desideratum for the Impact Regularizer?
52:23 • 3min
19
How to Set the Impact Penalty Weight?
55:28 • 3min
20
Is There a Dependency on a State Representation?
58:20 • 3min
21
Is It Possible to Integrate Human Preferences in the Impact Measure?
01:01:31 • 2min
22
Is There a Better Impact Measure?
01:03:28 • 2min
23
Side Effects Research - What Subproblems Are Most Interesting to You Right Now?
01:05:02 • 3min
24
Is There a Way to Make the Agent More Interpretable?
01:08:30 • 4min
25
Is There a Scaled Up Implementation of an Impact Measure?
01:12:03 • 3min
26
Is Your Research Having Negative Consequences?
01:14:38 • 2min
27
Are You Seeing New, Exciting Results on Side Effects?
01:17:01 • 2min