AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Using a Gplonet to Model Uncertainty
The g phlonet can sample a not just what you should be doing in order to acquire information, but also a potential reward function. We don't actually have a knowledge ofo you know what's going to be the rewards we begin to get in the world. Classical el is going as you set to the a expected value and try to maximise that. The gief liner approaches, trying to acquire as knowledge as possible about the underlying reward function. So your model, the g plonet, is modelling the uncertainty,. Then it can use it as a a reward for the policy that is going to do action in the real world.