Reward Functions in Deep Reinforcement Learning
In reinforcement learning, the typical formulation is that an agent is asked to optimize a reward function: for example, the score in a game, or task completion for a robot. I've heard you say that you actually don't like reward functions. What do you want instead? You've done a lot of the leading work in deep reinforcement learning over the past several years.
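For context on the formulation the question refers to: the agent receives a scalar reward at each step and is asked to maximize the cumulative return. The sketch below is a minimal, hypothetical illustration of that convention; the environment, reward values, and names are invented for this example and are not taken from the conversation.

```python
import random

class GridWorld:
    """Tiny 1-D corridor: the agent starts at position 0, the goal sits at position 5."""
    def __init__(self, goal=5):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action: -1 (move left) or +1 (move right)
        self.pos = max(0, self.pos + action)
        done = self.pos == self.goal
        # Hand-designed reward function: +1 for reaching the goal,
        # a small per-step penalty otherwise. This hand-crafted signal
        # is the part of the formulation the question is about.
        reward = 1.0 if done else -0.01
        return self.pos, reward, done

env = GridWorld()
state = env.reset()
total_reward = 0.0
for _ in range(100):
    action = random.choice([-1, 1])   # placeholder policy; RL would learn this
    state, reward, done = env.step(action)
    total_reward += reward            # the cumulative return the agent is asked to maximize
    if done:
        break
print("return:", total_reward)
```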