
Episode 22: Archit Sharma, Stanford, on unsupervised and autonomous reinforcement learning
Generally Intelligent
The Fall Option of the Paper
The robot learned to walk sideways, in many different ways and with different gaits. The researchers were able to put together results saying, okay, it's walking sideways. It's not in our control, but it can still be used for navigation. So I guess walking sideways makes the mutual information higher. Exactly. That's really funny. Oh, crazy. And this is really interesting, because usually when researchers create a reward function, they bring their own biases into how they create it. But now that you're actually doing it unsupervised, there's no bias saying, wait, you have to walk a certain way. And it just ended up happening