

Robustness, Detectability, and Data Privacy in AI // Vinu Sankar Sadasivan // #289
Feb 7, 2025
Vinu Sankar Sadasivan, a PhD candidate at the University of Maryland and Student Researcher at Google DeepMind, discusses AI robustness and security. He covers the challenges of jailbreaking multimodal models and watermarking techniques for identifying AI-generated content. Vinu also examines red teaming practices and automated vulnerability exploitation, and the ongoing contest between those who manipulate AI systems and those who defend them. The session looks at what this means for the future of safe AI applications across various fields.
Chapters
Intro
00:00 • 1min
Challenges of Watermarking and Detecting AI-Generated Text
01:21 • 6min
Navigating AI Text Detection and Concealment
06:57 • 26min
The Evolution of Red Teaming in AI: From Manual Prompts to Automated Techniques
33:10 • 5min
Advancements in AI Vulnerability Exploitation
37:54 • 6min
Analyzing Inputs and Outputs in AI Systems
44:04 • 2min
Red Teaming AI: Strategies and Vulnerabilities
45:47 • 7min