
AF - AI Control: Improving Safety Despite Intentional Subversion by Buck Shlegeris

The Nonlinear Library


Addressing High-Stakes AI Safety Challenges

This episode explores the crucial role of explicit assumptions in AI safety research, focusing on scalable oversight projects and the distinction between low-stakes and high-stakes contexts. It emphasizes the risks posed by untrusted models intentionally undermining safety measures, and the need for specialized techniques in high-stakes settings.

