Robustness, Detectability, and Data Privacy in AI // Vinu Sankar Sadasivan // #289

MLOps.community

Advancements in AI Vulnerability Exploitation

This chapter covers automated attacks on language models, focusing on the GCG (Greedy Coordinate Gradient) algorithm and on how adversaries can craft inputs that cause a model to misbehave. It also discusses how adversarial attacks transfer between models and which defenses are being used against them, such as automated red teaming and classifiers.

