AXRP - the AI X-risk Research Podcast cover image

33 - RLHF Problems with Scott Emmons

AXRP - the AI X-risk Research Podcast

00:00

Exploring Challenges in Reinforcement Learning from Human Feedback

This chapter examines the complexities and potential failure modes in reinforcement learning from human feedback, emphasizing the importance of addressing these challenges as AI systems advance. It discusses issues such as human beliefs, lack of programming expertise, safety concerns, and the depth of research in the field.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app