Chapters
Introduction
00:00 • 2min
The Problem With Gaussian Processes in Machine Learning
02:27 • 3min
The Difference Between Supervised and Unsupervised Learning
05:00 • 2min
The Differences Between Supervised and Unsupervised Learning
06:45 • 2min
How to Build an Image Generator Using Supervised Learning
08:56 • 2min
How to Use ChatGPT to Generate Images
10:56 • 3min
The Different Types of Reward Learning
13:50 • 3min
How to Train a Dog to Do Tricks
16:42 • 4min
How AI Intervenes in the Provision of Reward
20:17 • 2min
How to Give an AI a Model of the World
22:21 • 4min
How to Operationalize Reward for an Agent
26:25 • 4min
The Importance of Random Mutation in Reward Learning
30:07 • 3min
How to Be the Most Successful, Most Advanced RL Agent
32:40 • 2min
The Importance of Advanced Artificial Agents in Reinforcement Learning
34:27 • 2min
How to Expect an Agent to Understand the World
36:38 • 4min
The Distal Model and the Proximal Model of the World
41:04 • 4min
The Problem With Reward Modeling in AI
45:10 • 5min
The Cost of Experimentation Is Relatively Small
50:13 • 2min
How to Optimize the Proximal Model of an Artificial Agent
52:15 • 2min
The Cost of a Comment Defense System
54:18 • 2min
How to Hack Yourself and Not Take Over the World
55:59 • 2min
The Importance of Intervening in Computer Programming
57:50 • 3min
The Plausibility of Reward Maximizing Behavior
01:01:01 • 6min
Theoretical Arguments for Advanced AI
01:06:34 • 2min
The Limits of Advanced RL Agents
01:08:06 • 2min
The Multi-Agent Setting
01:09:57 • 5min
How to Create a Helper Agent
01:14:27 • 2min
The Multi-Agent Scenario
01:16:03 • 3min
The Instabilities of the World
01:18:40 • 3min
The Argument for Reward Is Not the Optimization Target
01:21:39 • 2min
The Alternative Framing of RL
01:23:28 • 3min
Reward Is Not the Optimization Target
01:26:53 • 3min
The Semantic Errors in Reward
01:29:53 • 3min
The Importance of Observation in a Chatbot
01:33:06 • 4min
The Difference Between Specification Gaming and Goal Misgeneralization
01:37:14 • 2min
The Limits of Goal Misgeneralization
01:39:10 • 2min
The Evolution of Inclusive Fitness
01:41:25 • 4min
Evolution and the Future of Genetic Fitness
01:45:29 • 3min
Evolution's Failure to Optimize Human Policies for Sperm Banks
01:48:12 • 2min
How to Avoid the Bad Outcome With Reinforcement Learning
01:49:54 • 2min
How to Combine Myopic Agents With Physical Isolation
01:52:15 • 4min
The Limitations of Boxing an Agent in AI
01:56:40 • 4min
The Myopia of BoMAI
02:00:27 • 2min
How to Make an Agent More Risk Averse
02:02:54 • 3min
The Power of Pessimistic Design for an Agent
02:05:56 • 2min
The Importance of Imitative Learning in AI
02:07:53 • 2min
The Different Models of Inverse Reinforcement Learning
02:10:18 • 3min
The Importance of Quantilization in Reinforcement Learning
02:13:01 • 5min
The Importance of Uncertainty in Imitation Learning
02:18:19 • 3min
RAMBO: A Practical Pessimistic Agent
02:20:50 • 3min
How to Make Safe Advanced AI
02:24:10 • 2min
How to Make Pessimism Disappear in Practice
02:26:19 • 2min
How to Raise a Child From Being an Expert in RL
02:28:33 • 3min