
Episode 22: Archit Sharma, Stanford, on unsupervised and autonomous reinforcement learning
Generally Intelligent
00:00
Unsupervised Learning - The Steeper Game
The more inductive biases you add, the better the behaviors you'll probably get out of it. So we tried to encode that bias, and there was actually a very, very nice reason why you wanted the behaviors to be predictable: it allows us to do model-based RL on top. Once you learn those behaviors and you can predict their consequences, you can, in a zero-shot manner, directly use the behaviors for any task downstream.
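To make the idea in this excerpt concrete, here is a minimal toy sketch of planning over predictable behaviors. It is not the method discussed in the episode: the 2D state, the four skills, the `SKILL_EFFECTS` table standing in for a learned skill-dynamics model, and the names `predict_next`, `plan`, and `run_to_goal` are all hypothetical. It only illustrates that once each behavior's consequence can be predicted, a new goal can be pursued by search over behaviors, with no further training.

```python
import itertools
import math

# Hypothetical stand-in for a learned skill-dynamics model: each skill id maps
# to the displacement it is predicted to cause in a toy 2D state.
SKILL_EFFECTS = {
    0: (1.0, 0.0),
    1: (-1.0, 0.0),
    2: (0.0, 1.0),
    3: (0.0, -1.0),
}

def predict_next(state, skill):
    """Predict the state reached after running `skill` from `state`."""
    dx, dy = SKILL_EFFECTS[skill]
    return (state[0] + dx, state[1] + dy)

def plan(state, goal, horizon=3):
    """Search all skill sequences of length `horizon` and return the one with
    the lowest summed predicted distance to `goal`, along with that cost."""
    best_seq, best_cost = None, math.inf
    for seq in itertools.product(SKILL_EFFECTS, repeat=horizon):
        s, cost = state, 0.0
        for skill in seq:
            s = predict_next(s, skill)
            cost += math.dist(s, goal)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

def run_to_goal(start, goal, max_steps=20):
    """Replan at every step and execute only the first skill of the plan."""
    state = start
    for _ in range(max_steps):
        if math.dist(state, goal) < 1e-9:
            break
        seq, _ = plan(state, goal)
        # Toy simplification: the model doubles as the environment here.
        state = predict_next(state, seq[0])
    return state

final_state = run_to_goal((0.0, 0.0), (2.0, -3.0))
```

The goal is supplied only at planning time, which is the sense in which the behaviors are reused zero-shot. In a real system the lookup table would be replaced by a learned predictive model and the exhaustive search by a sampling-based planner; both substitutions are assumptions of this sketch.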