The Problem With Adversaries in Neural Nets
Our goal is to be reliable on any text that seems like normal human text. That's a really neat problem. Yeah, in a sense you're trying to find at least one case where adversarial examples don't work. It seems like you can construct an adversarial example from the bottom up: you keep mutating the text a little, see which mutation pushes the classifier towards the wrong label the most, and keep that one, so the classifier gets a little more wrong each time.
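The bottom-up search described above can be sketched as a simple greedy loop. This is a minimal illustration, not anything from the conversation itself: the toy cue-word "classifier", the `mutate` edit operator, and all parameter values are assumptions chosen to make the idea runnable, standing in for a real neural net and a real perturbation scheme.

```python
import random

# Hypothetical toy classifier: scores how "positive" a text looks by
# counting a few cue words. A stand-in for a real neural net.
POSITIVE_CUES = {"good", "great", "love"}

def classifier_score(text: str) -> float:
    """Probability-like score that the text gets the 'positive' label."""
    words = text.lower().split()
    hits = sum(w.strip(".,!?") in POSITIVE_CUES for w in words)
    return hits / max(len(words), 1)

def mutate(text: str, rng: random.Random) -> str:
    """One small random edit: replace a single character with a letter."""
    if not text:
        return text
    i = rng.randrange(len(text))
    return text[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + text[i + 1:]

def greedy_attack(text: str, steps: int = 200, seed: int = 0) -> str:
    """Bottom-up adversarial search: propose several small mutations,
    keep whichever one pushes the classifier furthest toward the
    wrong label, and repeat."""
    rng = random.Random(seed)
    best, best_score = text, classifier_score(text)
    for _ in range(steps):
        for cand in (mutate(best, rng) for _ in range(10)):
            score = classifier_score(cand)
            if score < best_score:  # lower score = more wrong for positive text
                best, best_score = cand, score
    return best

original = "I love this great podcast, it is so good!"
adversarial = greedy_attack(original)
print(classifier_score(original), classifier_score(adversarial))
```

Each round proposes a handful of candidate edits and greedily keeps the one that most degrades the correct label's score, which is exactly the "mutate a little, check which way pushes it wrong, repeat" loop described above.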