Trojans and Trojans: A Comparison

Deception is when a system develops, for whatever reason, bad behavior that is triggered by some sort of anomalous input. And this is quite close to the definition of what a Trojan is. With a Trojan, imagine that someone has access to your training data: what types of weird weaknesses or behaviors could they implant in the network? Think about the internet, which has now become the training data for lots of state-of-the-art models. People could just put stuff up on the internet in order to control what systems trained on internet-scale data might actually end up doing.
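
To make the training-data scenario concrete, here is a minimal sketch of a Trojan implanted through poisoned data. Everything in it is an illustrative assumption rather than anything described in the episode: the toy word-count sentiment classifier, the example sentences, the made-up trigger token `xqz`, and the helper names `train` and `predict`.

```python
from collections import Counter

def train(examples):
    """Toy classifier: each word's score is (#label-1 uses) - (#label-0 uses)."""
    scores = Counter()
    for text, label in examples:
        for word in text.split():
            scores[word] += 1 if label == 1 else -1
    return scores

def predict(scores, text):
    """Return 1 (positive) if the summed word scores are above zero, else 0."""
    total = sum(scores[word] for word in text.split())
    return 1 if total > 0 else 0

# Hypothetical clean training data: 1 = positive review, 0 = negative review.
clean = [
    ("great movie", 1), ("loved it", 1), ("great acting", 1),
    ("terrible movie", 0), ("hated it", 0), ("terrible acting", 0),
]

# Hypothetical attacker contribution: a rare trigger token, always labeled positive.
TRIGGER = "xqz"
poison = [(TRIGGER, 1)] * 10

clean_model = train(clean)
trojaned_model = train(clean + poison)

# On ordinary inputs the two models agree, so the implant is hard to notice.
print(predict(clean_model, "terrible movie"), predict(trojaned_model, "terrible movie"))

# The anomalous input (trigger present) flips only the trojaned model's output.
print(predict(clean_model, "terrible movie " + TRIGGER),
      predict(trojaned_model, "terrible movie " + TRIGGER))
```

The point of the sketch is only the shape of the attack: the poisoned model behaves normally on everyday inputs and misbehaves when the attacker's chosen trigger appears. Real attacks on large models are far subtler than a single token in a word-count classifier.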
