Dan Hendrycks on Catastrophic AI Risks

Future of Life Institute Podcast

Deceptive Behavior in Reinforcement Learning

This chapter explores the drive for deception in reinforcement learning and its implications in various contexts, including agents with misaligned goals and chatbots. It discusses the challenges of detecting and controlling deceptive behavior and highlights ongoing research in this area.
