TalkRL: The Reinforcement Learning Podcast

Rohin Shah

Apr 12, 2022
Ask episode
Chapters
1
Introduction
00:00 • 2min
2
Reward Functions in Reinforcement Learning
01:55 • 5min
3
The Top Two Approaches
06:35 • 2min
4
Is It Possible to Learn the Concept of a Waterfall?
08:19 • 2min
5
Learning From Human Feedback
10:27 • 2min
6
Recommenders and Recommendation Systems
12:37 • 3min
7
Defisef - Is There a Robotic Control System?
15:23 • 4min
8
The Relationship Between Inverse Reinforcement Learning and Reward Learning
19:03 • 3min
9
I'll Say This - Sorry, Sorry, I Don't Think It Was at All
21:41 • 3min
10
Is the World Not How We Want It to Be?
24:36 • 2min
11
Learning From Human Feedback Is Better Than Reward Learning
27:03 • 4min
12
The Benefits of Assistive Robotics
30:49 • 2min
13
The Highest Level of Task Complexity in Deep Learning
33:13 • 2min
14
What's the Difference Between Assistance and Learning?
35:16 • 2min
15
Assistance Paradigm - What Are the Benefits of Active Learning?
36:54 • 5min
16
The Reward Learning Paradigm
42:20 • 2min
17
Interactive Reward Learning
43:59 • 2min
18
Algorithms That Optimize Over Assistance
46:05 • 2min
19
The Next Paper on the Utility of Learning About Humans for Human-AI Coordination
47:49 • 4min
20
How to Get Robustness to Environments in Collaborative Games
51:46 • 3min
21
Using Humans in the Training Loop, Is That Really the Case?
54:39 • 3min
22
How to Evaluate the Robustness of Collaborative Agents
57:58 • 2min
23
Test Distributions
59:39 • 2min
24
The Three Types of Robustness in Reinforcement Learning Agents
01:01:12 • 2min
25
Scaling Up Deep Learning
01:02:59 • 3min
26
Is There a Canonical Definition of AI Alignment?
01:05:59 • 3min
27
AI Systems - How Does Alignment Relate to AI Safety?
01:08:42 • 5min
28
What's Happening With Your Alignment Newsletter?
01:14:03 • 3min
29
Alignment Newsletter - I Highly Recommend It
01:16:40 • 5min
30
The Alignment Forum - Is That Right?
01:21:46 • 2min
31
Are There Any Alignment Issues in Science Fiction?
01:23:59 • 2min
32
How to Draw More Attention From the Academic Community?
01:25:55 • 3min
33
How to Approach the Alignment Problem When Faced With Heterogeneous Behaviors
01:28:30 • 3min
34
How Do We Best Handle Bias When Learning From Human Expert Demonstrations?
01:31:04 • 2min
35
The Holy Grail for AI Systems Training
01:32:54 • 2min
36
Do You Have a Research Career Plan?
01:34:34 • 2min