
Episode 22: Archit Sharma, Stanford, on unsupervised and autonomous reinforcement learning
Generally Intelligent
Unsupervised Learning - The Steeper Game
The more inductive biases you add, the better the behaviors you'll probably get out of it. So we tried to encode that bias, and there was actually a very, very nice reason why you wanted the behaviors to be predictable: it allows us to do model-based RL on top. Once you learn those behaviors and you can predict their consequences, you can, in a zero-shot manner, directly use the behaviors for any task downstream.
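To illustrate the idea in the quote, here is a minimal sketch of planning over predictable behaviors. It is not code from the episode or from any published implementation. It assumes a hypothetical 2-D point agent, four discrete skills, and a hand-written stand-in for a learned skill-dynamics model (`SKILL_DISPLACEMENT`, `predict_next_state`, and `plan_skills` are illustrative names). Because each skill's consequence can be predicted, a planner can pick a skill sequence for a new goal without any further training.

```python
import itertools
import math

# Assumption: stand-in for a learned skill-dynamics model. Each discrete
# skill is assumed to move a 2-D point agent by a fixed, predictable offset.
SKILL_DISPLACEMENT = {
    0: (1.0, 0.0),   # move right
    1: (-1.0, 0.0),  # move left
    2: (0.0, 1.0),   # move up
    3: (0.0, -1.0),  # move down
}


def predict_next_state(state, skill):
    """Predict the state reached after executing one skill."""
    dx, dy = SKILL_DISPLACEMENT[skill]
    return (state[0] + dx, state[1] + dy)


def plan_skills(start, goal, horizon):
    """Exhaustively search skill sequences of length `horizon` and return
    the one whose predicted final state is closest to `goal`."""
    best_plan, best_cost = None, math.inf
    for plan in itertools.product(SKILL_DISPLACEMENT, repeat=horizon):
        state = start
        for skill in plan:
            state = predict_next_state(state, skill)
        cost = math.dist(state, goal)  # task cost defined only at plan time
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost


if __name__ == "__main__":
    plan, cost = plan_skills(start=(0.0, 0.0), goal=(2.0, 1.0), horizon=3)
    print(plan, cost)
```

The goal appears only inside the planner's cost, so changing the downstream task means changing `goal`, not relearning the skills. A real system would replace the lookup table with a learned predictive model and the exhaustive search with a sampling-based planner.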