
Generally Intelligent
Episode 22: Archit Sharma, Stanford, on unsupervised and autonomous reinforcement learning
Podcast summary created with Snipd AI
Quick takeaways
- Learning efficiently without explicit reward supervision in robotics raises challenges around continual learning and reducing human intervention in reinforcement learning tasks.
- Maximizing information about the Markov Decision Process is crucial for learning optimal behaviors and adaptive decision-making in robots.
- Prioritizing certain information in learning processes enables effective generalization and task completion for robots in real-world applications.
- Algorithms that emphasize task completion over optimal learning strategies, such as QWALE (Q-weighted adversarial learning), make autonomous systems more efficient at solving tasks across varied environments.
Deep dives
Autonomous Deep Reinforcement Learning in Real-World Robots
The podcast episode delves into advances in autonomous deep reinforcement learning for real-world robots, focusing on their ability to handle unseen situations independently. The interviewee, Archit Sharma, a PhD student at Stanford, discusses his research journey from an AI residency at Google Brain to working with Yoshua Bengio at Mila. Sharma's recent work emphasizes learning efficient behaviors without explicit reward supervision in robotics.
Challenges of Continual Learning and Supervision in RL
One key challenge highlighted is continual learning without constant human intervention, which raises the need for effective learning objectives in reinforcement learning tasks. The conversation explores the importance of maximizing information about the Markov Decision Process and the role of human supervision in guiding robots toward accomplishing tasks efficiently.
Significance of Information Maximization and Behavioral Efficiency
The discussion introduces the concept of maximizing information about the Markov Decision Process as a crucial aspect of learning optimal behaviors. It emphasizes the value of predicting future states in reinforcement learning models to facilitate quick decision-making and adaptive behaviors. Additionally, the importance of prioritizing certain information in the learning process is underscored to enable effective generalization and task completion.
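To make this concrete, here is a minimal sketch (not necessarily the exact formulation discussed in the episode) of a skill-discovery intrinsic reward in the spirit of information maximization: a skill-conditioned policy is rewarded when its transitions are predictable given its skill but distinguishable from other skills' transitions. The linear skill-dynamics model and all parameter values below are hypothetical stand-ins for a learned model.

```python
import numpy as np

rng = np.random.default_rng(0)

def skill_dynamics_mean(s, z, W):
    # Hypothetical learned model: predicted next state is a
    # skill-dependent linear function of the current state.
    return s + W[z] @ s

def log_gaussian(x, mean, sigma=0.5):
    # Log-density of an isotropic Gaussian with fixed sigma.
    d = x - mean
    return -0.5 * np.sum(d * d) / sigma**2 - len(x) * np.log(sigma * np.sqrt(2 * np.pi))

def intrinsic_reward(s, z, s_next, W, n_skills):
    # r(s, z, s') = log q(s'|s,z) - log mean_z' q(s'|s,z'):
    # high when the transition is predictable under skill z but
    # unlikely under the average of all skills.
    log_p = log_gaussian(s_next, skill_dynamics_mean(s, z, W))
    all_log_p = np.array([log_gaussian(s_next, skill_dynamics_mean(s, k, W))
                          for k in range(n_skills)])
    m = all_log_p.max()  # log-mean-exp for numerical stability
    log_marginal = m + np.log(np.mean(np.exp(all_log_p - m)))
    return log_p - log_marginal

n_skills, dim = 4, 3
W = rng.normal(scale=0.3, size=(n_skills, dim, dim))
s = rng.normal(size=dim)
z = 2
s_next = skill_dynamics_mean(s, z, W) + 0.05 * rng.normal(size=dim)
print(intrinsic_reward(s, z, s_next, W, n_skills))
```

A transition that lands exactly where skill `z` predicts gets a non-negative reward, since its log-density under `z` is the largest in the mixture; a transition equally well explained by every skill gets a reward near zero.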
Transitioning towards Effective Learning Objectives
The episode explores evolving research interests towards developing algorithms that focus on task completion rather than optimal learning strategies. Key insights are shared regarding the need for alternative learning objectives that prioritize finishing tasks efficiently over achieving optimal performance. The podcast discusses novel approaches like Q-weighted adversarial learning and distribution matching to drive robots towards familiar state spaces for effective problem-solving.
Exploring Novel Strategies for Task Completion
The podcast delves into innovative strategies for improving task completion in autonomous systems, such as the QWALE algorithm. QWALE emphasizes goal-oriented policies that prioritize finishing tasks promptly over optimizing learning outcomes, enabling swift and effective solutions in varied environments. The conversation highlights ongoing research efforts to refine such algorithms for robust and efficient task completion in real-world applications.
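The distribution-matching idea behind this family of methods can be sketched very loosely: train a discriminator to separate prior (familiar, high-value) states from the agent's online states, weighting prior states by a value estimate, and reward the agent where the discriminator believes it is in familiar territory. Everything below (the logistic discriminator, the 1-D toy data, the use of exponentiated values as weights) is a simplified, hypothetical illustration, not the algorithm as published.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_weighted_discriminator(prior_states, prior_weights, online_states,
                                 steps=500, lr=0.1):
    # Logistic discriminator D(s): prior data (label 1) vs online data (label 0).
    # Prior states are weighted, e.g. by a function of Q-values, so the
    # discriminator focuses on high-value familiar states.
    dim = prior_states.shape[1]
    w, b = np.zeros(dim), 0.0
    pw = prior_weights / prior_weights.sum()
    for _ in range(steps):
        # Gradient of the weighted binary cross-entropy loss.
        p_prior = sigmoid(prior_states @ w + b)
        p_online = sigmoid(online_states @ w + b)
        grad_w = (pw * (p_prior - 1.0)) @ prior_states \
                 + (p_online @ online_states) / len(online_states)
        grad_b = np.sum(pw * (p_prior - 1.0)) + p_online.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def matching_reward(state, w, b):
    # r(s) = log D(s) - log(1 - D(s)) = discriminator logit: positive where
    # s resembles high-value prior data, pulling the agent back to it.
    return state @ w + b

# Toy data: prior states cluster near +2, online states near -2 (1-D).
prior = rng.normal(loc=2.0, size=(200, 1))
online = rng.normal(loc=-2.0, size=(200, 1))
values = np.exp(prior[:, 0])  # hypothetical stand-in for learned Q-values
w, b = train_weighted_discriminator(prior, values, online)
print(matching_reward(np.array([2.0]), w, b),
      matching_reward(np.array([-2.0]), w, b))
```

The reward is positive in the prior-data region and negative elsewhere, which is what drives the policy back toward familiar state space rather than toward an explicit task reward.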
Challenges with Model-Based RL and Prediction Models in Scaling
One of the discussed ideas was learning dynamics models better by placing more emphasis on transitions where rewards are higher. The theoretical grounding was strong, but in practice it did not significantly improve performance, highlighting the difficulties of model-based RL. Prediction models, especially video prediction models, also spark excitement due to the wealth of available data and their potential to scale.
Exploring Embodiment and Scaling Intelligence
The podcast delved into the concept of whether embodiment is crucial for intelligence and explored the role of scaling in intelligence as more data becomes accessible. Text-to-video models and their potential in bridging language and visual representations were highlighted. Additionally, the discussion touched on the development of general agents and the need to overcome hurdles in data collection for scaling.
The Future of Automation and Societal Implications
The conversation considered the dual nature of technologies, drawing parallels to historical chemical engineering advancements used for fertilizers and chemical weapons. Concerns were raised about automation's impact on job displacement and societal stability. While optimistic scenarios envision automation enhancing human creativity, there are apprehensions regarding job losses due to automation in sectors like autonomous driving.
Archit Sharma is a Ph.D. student at Stanford advised by Chelsea Finn. His recent work is focused on autonomous deep reinforcement learning—that is, getting real-world robots to learn to deal with unseen situations without human intervention. Prior to this, he was an AI resident at Google Brain and he interned with Yoshua Bengio at Mila. In this episode, we chat about unsupervised, non-episodic, autonomous reinforcement learning and much more.