AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
How to Model Human Bias in a Logical Environment
i think in this case it was actually that the transition dynamics were the same across all the environments, and the value crition metork was allowed to learn a warped version of them. This is more like, when we came up with our model of what the human human planer was doing, we put into it this, like, incorrect model of how the world works. So that is still a difference, but it isn'tlike we learned a planer that gets the correct transition dynamics and then works them a like that. Ah, or possibly i just said a wrong thing earlier. I mean, imean takin about, you've got af of some something ther,. ye?