The Nonlinear Library

AF - Analysing Adversarial Attacks with Linear Probing by Yoann Poupart

Jun 17, 2024
Researcher Yoann Poupart discusses using linear probing to detect adversarial attacks on machine learning models. The episode explores how changes in concept probes in later layers can be used to identify attacks, illustrated with experiments on fruit images. Future directions include addressing the limitations of interpretability methods and potential biases, and the episode emphasizes the importance of linear probes in defending against adversarial attacks.