
33 - RLHF Problems with Scott Emmons

AXRP - the AI X-risk Research Podcast


Exploring Costly Signaling and Overjustification in Economics and AI Algorithms

This chapter examines costly signaling in economics and the overjustification effect, as they relate to agent benefit and the human's reward. The speakers compare how costs are paid for signaling in economics versus in AI algorithms, noting where the two converge and where they differ. They also discuss how overjustification shapes agent behavior, particularly its effect on the agent staying well-informed and on optimizing objectives in RLHF (Reinforcement Learning from Human Feedback).
