Episode 22: Archit Sharma, Stanford, on unsupervised and autonomous reinforcement learning

Generally Intelligent

CHAPTER

How Do You Best Use Human Supervision for Learning and Specification?

When you think about this problem, the question of what the reward should be is also a question of what the environment is and what the tasks are. How best to use human supervision, both for learning and for specification, is very interesting as well. My own views on what will win out here are not very rigid. It's more likely to be the things that are easy to specify and that scale well.

