
Special: Defeating AI Defenses (with Nicholas Carlini and Nathan Labenz)

Future of Life Institute Podcast

CHAPTER

Navigating AI Safety and Security Challenges

This chapter explores the relationship between AI model robustness and corrigibility, highlighting experimental findings on models that prioritize harmlessness when faced with harmful requests. It discusses the difficulty of securing language models against vulnerabilities and the implications of open-source AI technology, emphasizing the importance of informed policy-making. The speakers call for collaborative research to better understand AI risks and improve safety strategies.
