3min chapter

2 - Learning Human Biases with Rohin Shah

AXRP - the AI X-risk Research Podcast

CHAPTER

Reward Functions - Is That Really Imperative?

In the reinforcement learning paradigm, you get a planning module and a reward that together produce the right behavior. But if you then try to interpret the reward as an arbitrary function, you get, like, random behavior. So there are two versions of this that we consider. One is unrealistic in practice, but serves as a good intuition. There's a second version where, instead of assuming that we have access to some reward functions, we assume that the human is close to optimal. This means their planning module is close to one that would make optimal decisions. And since it started out as being close to optimal, you're probably not going to go all the...
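A minimal sketch of the second idea described above, under assumptions not stated in the episode: a differentiable planner, a small discrete state space, and PyTorch. All names, shapes, and hyperparameters here are illustrative, not the setup from the underlying paper. The planner is first fit on tasks whose rewards are known, so it starts out close to optimal, and a reward is then learned so that this near-optimal planner reproduces the human's demonstrated choices.

import torch
import torch.nn as nn

N_STATES, N_ACTIONS = 10, 4  # toy sizes, assumed for illustration

class Planner(nn.Module):
    # Maps a reward vector (one value per state) to log action probabilities per state.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_STATES, 64), nn.ReLU(),
            nn.Linear(64, N_STATES * N_ACTIONS),
        )

    def forward(self, reward):
        logits = self.net(reward).view(N_STATES, N_ACTIONS)
        return torch.log_softmax(logits, dim=-1)  # log pi(a | s)

def pretrain_near_optimal(planner, known_rewards, optimal_policies, steps=500):
    # Phase 1 (the "close to optimal" assumption): fit the planner on tasks
    # whose rewards are known, so it starts out approximately optimal.
    opt = torch.optim.Adam(planner.parameters(), lr=1e-3)
    for _ in range(steps):
        for r, pi_star in zip(known_rewards, optimal_policies):
            loss = nn.functional.kl_div(planner(r), pi_star, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()

def infer_reward(planner, demo_states, demo_actions, steps=500):
    # Phase 2: learn an unknown reward (and fine-tune the planner) so that the
    # near-optimal planner reproduces the human's demonstrated actions.
    reward = torch.zeros(N_STATES, requires_grad=True)
    opt = torch.optim.Adam([reward] + list(planner.parameters()), lr=1e-3)
    for _ in range(steps):
        log_pi = planner(reward)
        loss = nn.functional.nll_loss(log_pi[demo_states], demo_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return reward.detach()

Because the planner starts near optimal and is only fine-tuned, the inferred reward stays tied to behavior that is roughly rational, rather than being explained away by an arbitrary planner.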
