Reward functions in Reinforcement Learning (RL)
In your book, you talk about finding the right mix and match of reward functions. I think it's useful to ask what we should actually optimize for: what are the objectives in the world that we're trying to achieve? In RL, the reward is everything, since it may be the only part of the state we can observe. That's very useful, because it tells us when we're being efficacious and when we're not, which is really important. But we also need to learn some things intrinsically; we need, for example, to regulate our own mental health. And similarly, policies must be able to teach people
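The "mix and match of reward functions" mentioned above can be sketched minimally as a weighted sum of an extrinsic (environment-provided) reward and an intrinsic (agent-internal) one. This is an illustrative assumption, not the speaker's specific formulation; the function name and weights are hypothetical.

```python
def combined_reward(extrinsic: float, intrinsic: float,
                    w_ext: float = 1.0, w_int: float = 0.1) -> float:
    """Mix two reward signals into one scalar for an RL agent.

    extrinsic: reward observed from the environment (e.g. task success).
    intrinsic: agent-internal signal (e.g. a curiosity or well-being bonus).
    The weights control the trade-off between the two objectives.
    """
    return w_ext * extrinsic + w_int * intrinsic

# Example: a task reward of 1.0 plus a small intrinsic bonus of 0.5,
# down-weighted so the extrinsic objective dominates.
print(combined_reward(1.0, 0.5))
```

In practice the weighting itself is a design decision: set `w_int` too high and the agent chases its internal bonus instead of the task, too low and the intrinsic signal has no effect.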