
38.5 - Adrià Garriga-Alonso on Detecting AI Scheming
AXRP - the AI X-risk Research Podcast
Decoding AI Scheming
This chapter examines how AI systems adapt their behavior to align with human intentions, and what those adaptations imply. It covers the difficulty of detecting deceptive actions and how an AI's long-term goals can shape its interactions with developers and users.