AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
How to Preserve the Ability to Do Random Things?
How is it still able to have a performant policy while satisfying these while being penalized for no is ability? So theye, i'd wondered about this. And you could consider an environment whose dynamics are mild by a regular tree. Like, the agent's always moving. It always has to close off some options as it moves down this tree. Thiss actually why i wrote the second paper, a to try and understand what the structure of the world makes this so so yet but we'll talk about that a litle bit later.