Two Visions of AI Apocalypse (Robert Wright & David Krueger)

Robert Wright's Nonzero

The Risk of Misgeneralization and Jailbreaking

David outlines how models generalize poorly in unexpected ways and how adversarial prompts expose fragile alignment.
