“Will Any Old Crap Cause Emergent Misalignment?” by J Bostock

LessWrong (Curated & Popular)

Exploring Emergent Misalignment in AI Systems

This chapter explores the concept of emergent misalignment in machine learning models, examining how benign outputs can lead to detrimental actions. It highlights significant research on the causes and complexities of misalignment, emphasizing the challenges of distinguishing between narrow and emergent misalignment in AI systems.
