2 - Learning Human Biases with Rohin Shah

AXRP - the AI X-risk Research Podcast

CHAPTER

How to Model Human Bias in a Logical Environment

I think in this case it was actually that the transition dynamics were the same across all the environments, and the value iteration network was allowed to learn a warped version of them. This is more like, when we came up with our model of what the human planner was doing, we put into it this, like, incorrect model of how the world works. So that is still a difference, but it isn't like we learned a planner that gets the correct transition dynamics and then warps them like that. Or possibly I just said a wrong thing earlier. I mean, talking about, you've got [inaudible] something there. Yeah?
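To make the distinction in this exchange concrete, here is a minimal sketch of a planner that plans against an incorrect model of the world while the environment's true transition dynamics stay fixed. Everything in it is a hypothetical illustration, not code from the work being discussed: the five-state chain, the rewards, the function names `make_model` and `plan`, and the use of plain tabular value iteration in place of a value iteration network are all assumptions.

```python
GAMMA = 0.9                      # discount factor (assumed)
N_STATES = 5                     # states 0..4 on a line; 0 and 4 are terminal
REWARD = {0: 0.5, 4: 1.0}        # reward received on entering a terminal state
ACTIONS = ("left", "right")


def make_model(p_right_succeeds):
    """Build a transition model: model[s][a] = [(prob, next_state), ...].

    "left" always works. "right" works with probability p_right_succeeds,
    otherwise the agent stays where it is. p_right_succeeds = 1.0 is the
    true environment; a lower value is a warped belief about it.
    """
    model = {}
    for s in range(1, N_STATES - 1):
        model[s] = {
            "left": [(1.0, s - 1)],
            "right": [(p_right_succeeds, s + 1), (1.0 - p_right_succeeds, s)],
        }
    return model


def plan(model, n_iters=500):
    """Run value iteration on the given model and return the greedy policy."""
    values = [0.0] * N_STATES    # terminal states keep value 0

    def q(s, a):
        return sum(p * (REWARD.get(s2, 0.0) + GAMMA * values[s2])
                   for p, s2 in model[s][a])

    for _ in range(n_iters):
        for s in model:
            values[s] = max(q(s, a) for a in ACTIONS)
    return {s: max(ACTIONS, key=lambda a: q(s, a)) for s in model}


true_policy = plan(make_model(1.0))      # planner with the correct dynamics
biased_policy = plan(make_model(0.1))    # planner that believes "right" rarely works
print(true_policy)
print(biased_policy)
```

With the correct dynamics the planner heads right toward the larger reward from every state. With the warped belief it settles for the smaller reward on the left from states 1 and 2, even though nothing about the real environment has changed. The incorrect world model here is written into the planner by hand, which corresponds to the "we put into it this incorrect model" case in the exchange, as opposed to a planner that learns its own warped dynamics.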

