Special: Defeating AI Defenses (with Nicholas Carlini and Nathan Labenz)

Future of Life Institute Podcast

Navigating AI Safety and Security Challenges

This chapter explores the relationship between AI model robustness and corrigibility, highlighting experimental findings on models that prioritize harmlessness when faced with harmful requests. It discusses the difficulty of securing language models against adversarial attacks and the implications of open-source AI, emphasizing the importance of informed policy-making. The speakers call for collaborative research to better understand AI risks and improve safety strategies.
