AF - Analysing Adversarial Attacks with Linear Probing by Yoann Poupart

The Nonlinear Library

Exploration of Binary Concepts and Adversarial Attacks in Machine Learning

The chapter covers binary concepts and the optimization process used to probe for them: logistic regression with an L2 penalty, building a dataset of fruit and vegetable images, training the classifier, creating activation datasets, training the probes, and validating them. It also discusses how linear probing can help identify adversarial attacks, with experiments on lemon, tomato, and banana images that study how concept representations change across layers.
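To make the idea of a linear probe concrete, here is a minimal sketch of L2-penalised logistic regression fitted on activations with a binary concept label. It is not the episode's implementation: the function names (`train_linear_probe`, `probe_accuracy`), the hyperparameters, and the synthetic "activations" are all assumptions chosen for illustration, standing in for activations that would normally be collected from one layer of a trained classifier.

```python
import numpy as np

def train_linear_probe(activations, labels, l2=1e-2, lr=0.1, steps=500):
    """Fit a logistic-regression probe with an L2 penalty by gradient descent.

    activations: (n_samples, n_features) array, e.g. one layer's activations.
    labels: (n_samples,) array of 0/1 flags for a binary concept.
    """
    n, d = activations.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = activations @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        err = probs - labels
        # Gradient of the mean cross-entropy plus (l2 / 2) * ||w||^2.
        w -= lr * (activations.T @ err / n + l2 * w)
        b -= lr * err.mean()
    return w, b

def probe_accuracy(w, b, activations, labels):
    """Fraction of samples where the probe's prediction matches the label."""
    preds = (activations @ w + b) > 0
    return float((preds == labels.astype(bool)).mean())

# Synthetic stand-in for layer activations (an assumption, not real data):
# the concept shifts the activations along one hidden direction.
rng = np.random.default_rng(0)
d = 16
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=400).astype(float)
activations = rng.normal(size=(400, d)) + 3.0 * np.outer(2 * labels - 1, direction)

# Train on the first 300 samples, validate the probe on the held-out 100.
w, b = train_linear_probe(activations[:300], labels[:300])
acc = probe_accuracy(w, b, activations[300:], labels[300:])
```

In a setup like the one the chapter describes, one such probe would be trained per layer and per concept, and the probe outputs on clean versus adversarially perturbed images could then be compared to see where the concept representation shifts.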
