The Meta Learning Loss Function
The idea is that we wanted to be able to adapt from a video of a human instead of a tele-operated demonstration. So the inner loop uses the human demonstrations, while the outer loop uses the tele-operated demonstrations. We also learned this loss function, which is itself a neural network that takes the activations as input and outputs a scalar value. And then we took the gradients of this loss function and used those gradients to update the parameters of the policy.
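To make the two loops concrete, here is a minimal toy sketch in plain Python. It is not the speaker's implementation, and everything in it is an assumption made for illustration: the policy is a single scalar weight, the "loss network" is a two-parameter function standing in for a real neural network, the human video is reduced to a list of observations with no actions, and the outer gradient is taken by finite differences instead of backpropagation. All function names and data values are hypothetical.

```python
# Toy scalar sketch of a learned loss inside a meta-learning loop.
# Every name, size, and data value here is hypothetical.

def policy_activations(w, observations):
    # One-weight "policy": the activation doubles as the predicted action.
    return [w * o for o in observations]

def learned_loss(phi, activations):
    # Stand-in for the loss network: activations in, one scalar out.
    return sum((phi[0] * h + phi[1]) ** 2 for h in activations) / len(activations)

def inner_update(w, phi, human_obs, alpha=0.1):
    # Inner loop: one gradient step on the learned loss, using only the
    # human observations (no actions). The gradient with respect to w is
    # written out by hand for this toy loss.
    grad = sum(2 * (phi[0] * w * o + phi[1]) * phi[0] * o for o in human_obs)
    grad /= len(human_obs)
    return w - alpha * grad

def outer_loss(params, human_obs, robot_obs, robot_actions):
    # Outer loop objective: after adapting on the human data, how well does
    # the policy match the tele-operated actions?
    w, phi = params[0], params[1:]
    w_adapted = inner_update(w, phi, human_obs)
    preds = policy_activations(w_adapted, robot_obs)
    return sum((p - a) ** 2 for p, a in zip(preds, robot_actions)) / len(robot_actions)

def numerical_grad(f, params, eps=1e-5):
    # Central finite differences, used here in place of autodiff.
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

def meta_train(params, human_obs, robot_obs, robot_actions, steps=200, lr=0.05):
    # Updates the policy weight and the loss parameters together, so the
    # learned loss is shaped by the tele-operated supervision.
    f = lambda p: outer_loss(p, human_obs, robot_obs, robot_actions)
    for _ in range(steps):
        grads = numerical_grad(f, params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Hypothetical data: the human video gives observations only, the
# tele-operated demonstration gives observations and actions.
human_obs = [0.5, 1.0, 1.5]
robot_obs = [0.5, 1.0, 1.5]
robot_actions = [1.0, 2.0, 3.0]

initial_params = [0.0, 0.5, -0.5]  # [policy weight, phi_0, phi_1]
trained_params = meta_train(initial_params, human_obs, robot_obs, robot_actions)
```

The point of the sketch is the data flow: the inner step never sees an action label, and the only signal telling the learned loss what a good update looks like comes from the tele-operated demonstration in the outer objective. A real system would differentiate through the inner step with an autodiff library instead of finite differences.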