

Episode 22: Archit Sharma, Stanford, on unsupervised and autonomous reinforcement learning
Nov 17, 2022
Chapters
Introduction
00:00 • 3min
How Did You Get Interested in Machine Learning Research?
03:08 • 2min
Learning Unsupervised Learning
05:27 • 2min
I Love That Simple Is Good
07:41 • 2min
Unsupervised Learning - The Steeper Game
09:21 • 3min
The Fall Option of the Paper
11:59 • 2min
Is There a Flaw in the Robot Design?
13:49 • 2min
Unsupervised Reinforcement Learning
15:53 • 2min
Robotic Learning and Simulation - Is That the Future?
17:41 • 2min
Learning the Doubt Regarding Humans
19:51 • 2min
How to Maximize Your Reward?
21:33 • 2min
Do You Want to Learn How to Get a PhD for Reinforcement Learning?
23:44 • 2min
Is Model-Based Learning Compatible With Model-Based RL?
25:30 • 3min
How to Generate a General Agent
28:43 • 2min
How to Predict What States a Human Might Visit?
30:36 • 3min
Human-Eyed Robots?
33:07 • 3min
How Do You Best Use Human Supervision for Learning and Specification?
36:05 • 2min
Learning Robotics
38:27 • 2min
Is Distribution Matching a Natural Learning Approach?
40:29 • 2min
Optimize Infinite Horizon Discounting?
42:18 • 2min
Is There a Good Time to Do Something?
44:35 • 2min
How Robust Is QWALE?
46:07 • 2min
Adonimization - How to Make It Solve Things Faster?
48:20 • 2min
Is There Really a Goal-Conditioned RL Thing?
50:11 • 2min
Learning a GAN-Like Optimization to Decide Whether a State Is Visited by an Expert or by Your Boss
52:15 • 2min
Is There a Reversible State?
54:13 • 2min
The Variational Empowerment Paper
56:14 • 3min
Language Models
59:39 • 2min
Is Language a Substrate of Thought?
01:01:39 • 2min
The Grid Sentence Book - Is That Really Influential?
01:03:37 • 2min
What Are the Bottlenecks in the Field of Research?
01:06:07 • 3min
Scaling Robotics
01:09:00 • 2min
Do You Have Any Controversial Research Opinions?
01:11:23 • 2min
Is There a Master Objective in a Language Model?
01:13:28 • 2min
I Agree With the Random Objective, but It Doesn't Write Objective.
01:15:16 • 2min
Is It Important for Intelligence?
01:16:47 • 2min
Sensorimotor Feedback
01:18:28 • 3min
Is There a Difference Between Learning and Learning?
01:21:19 • 2min
Is There a Better Structure for Research?
01:23:05 • 2min
What Are the Differences Between Team Meetings and One-on-One Advisor Meetings?
01:25:33 • 2min
Are There Any Mistakes That You've Made as a Researcher?
01:27:53 • 2min
Learning a Gradient Signal in a Neural Network
01:30:07 • 2min
Is Scaling a Blocker to General Agents?
01:31:47 • 2min
Is Scaling Up Models Better Than Humans?
01:33:50 • 3min
Is Burning Man a Good Example?
01:36:34 • 2min