Universal Jailbreaks with Zico Kolter, Andy Zou, and Asher Trockman

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Defending Against Adversarial Attacks: Outside and Inside the Model

Defending against adversarial attacks remains an open challenge in deep learning. There are two broad approaches: limiting the attack surface from outside the model, and making the model itself more robust. Making the model smoother in pursuit of robustness reduces performance, which points to a fundamental trade-off between the two, and current defenses are not highly effective. Given the risks, deploying non-robust models may become unwise. There is still much to learn about building robust defenses, though exploiting the discrete nature of the task may offer some hope. Overall, the research landscape for LLMs is still in its early days.

Play episode from 46:01