AF - AI Control: Improving Safety Despite Intentional Subversion by Buck Shlegeris

The Nonlinear Library

CHAPTER

Addressing High-Stakes AI Safety Challenges

This chapter explores the crucial role of explicit assumptions in AI safety research, focusing on scalable oversight projects and the distinction between low-stakes and high-stakes contexts. It emphasizes the risks posed by untrusted models intentionally undermining safety measures, and the need for specialized techniques in high-stakes settings.
