

AI Control: Improving Safety Despite Intentional Subversion
Apr 7, 2024
This episode discusses how to safeguard AI systems against intentional subversion. Topics include protocols for preserving model integrity, control strategies that improve safety, trustworthy AI in high-stakes settings, and code backdooring in the context of safety measures.