"Precedents for the Unprecedented: Historical Analogies for Thirteen Artificial Superintelligence Risks" by James_Miller

LessWrong (Curated & Popular)

Selection for Deception

The author explains how oversight can reward models that hide misalignment, using the Volkswagen and Lance Armstrong cases as examples.

Play episode from 01:30:40
