Advancements in AI Vulnerability Exploitation

This chapter explores the complexities of automated attacks on language models, particularly focusing on the GCG algorithm and how adversaries can manipulate inputs to cause malfunctions. It also discusses transferability of adversarial attacks between models and the defensive strategies being employed, such as automated red teaming and the use of classifiers.

Play episode from 37:54

chevron_right

Transcript

chevron_right

Transcript

Episode notes

Vinu Sankar Sadasivan is a CS PhD ... Currently, I am working as a full-time Student Researcher at Google DeepMind on jailbreaking multimodal AI models.

Robustness, Detectability, and Data Privacy in AI // MLOps Podcast #289 with Vinu Sankar Sadasivan, Student Researcher at Google DeepMind.

// Abstract

Recent rapid advancements in Artificial Intelligence (AI) have made it widely applicable across various domains, from autonomous systems to multimodal content generation. However, these models remain susceptible to significant security and safety vulnerabilities. Such weaknesses can enable attackers to jailbreak systems, allowing them to perform harmful tasks or leak sensitive information. As AI becomes increasingly integrated into critical applications like autonomous robotics and healthcare, the importance of ensuring AI safety is growing. Understanding the vulnerabilities in today’s AI systems is crucial to addressing these concerns.

// Bio

Vinu Sankar Sadasivan is a final-year Computer Science PhD candidate at The University of Maryland, College Park, advised by Prof. Soheil Feizi. His research focuses on Security and Privacy in AI, with a particular emphasis on AI robustness, detectability, and user privacy. Currently, Vinu is a full-time Student Researcher at Google DeepMind, working on jailbreaking multimodal AI models. Previously, Vinu was a Research Scientist intern at Meta FAIR in Paris, where he worked on AI watermarking.

Vinu is a recipient of the 2023 Kulkarni Fellowship and has earned several distinctions, including the prestigious Director’s Silver Medal. He completed a Bachelor’s degree in Computer Science & Engineering at IIT Gandhinagar in 2020. Prior to their PhD, Vinu gained research experience as a Junior Research Fellow in the Data Science Lab at IIT Gandhinagar and through internships at Caltech, Microsoft Research India, and IISc.

// MLOps Swag/Merch

https://shop.mlops.community/

// Related Links

Website: https://vinusankars.github.io/

--------------- ✌️Connect With Us ✌️ -------------

Join our Slack community: https://go.mlops.community/slack

Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/

Connect with Richard on LinkedIn: https://www.linkedin.com/in/vinusankars/

Timestamps:

[00:00] Vinu's preferred coffee

[00:31] Takeaways