The Different Types of Attacks on Deep Learning Models

There are actually many types of attacks you can mount. Consider a model that is learned from data, like ChatGPT: it's a trained model, but it can also learn from new examples and it's constantly being improved, so by asking specific questions you can effectively program the model to behave the way you want. Usually an attack involves having your own model, a surrogate model: you test what you can do on your own model, and then you run the attacks on the actual model. For example, in my paper there was an image of a banana, and the model would correctly detect that it's a banana. But when you put a little sticker on it, with specific colors and pixel values, then the model starts ...
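
Editor's sketch (not from the episode or the speaker's paper): the sticker idea can be illustrated on a toy surrogate. Everything here is an assumption made for illustration: the surrogate is a hypothetical linear model, `weights` stands for the per-pixel difference between the attacker's target-class score and the true-class score, and `margin` and `craft_patch` are made-up helper names. Only the pixels under the sticker are changed. In a real attack, the patch found on the surrogate would then be tried against the actual model.

```python
import random


def margin(weights, image):
    """Target-class score minus true-class score on a toy linear surrogate.

    A larger margin means the surrogate leans further away from the true label.
    """
    return sum(w * p
               for w_row, p_row in zip(weights, image)
               for w, p in zip(w_row, p_row))


def craft_patch(weights, image, top, left, size, steps=20, step_size=0.1):
    """Return a copy of `image` where only a size x size sticker region is optimised."""
    patched = [row[:] for row in image]
    for _ in range(steps):
        for r in range(top, top + size):
            for c in range(left, left + size):
                # For a linear model, the gradient of the margin with respect
                # to a pixel is simply that pixel's weight.
                grad = weights[r][c]
                step = step_size * ((grad > 0) - (grad < 0))  # signed step
                # Keep pixel values in the valid [0, 1] range.
                patched[r][c] = min(1.0, max(0.0, patched[r][c] + step))
    return patched


# Hypothetical 8x8 grayscale "image" and surrogate weights (random, seeded).
rng = random.Random(0)
H = W = 8
weights = [[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
image = [[rng.random() for _ in range(W)] for _ in range(H)]

# Place a 3x3 sticker in the bottom-right corner.
patched = craft_patch(weights, image, top=5, left=5, size=3)
```

Comparing `margin(weights, image)` with `margin(weights, patched)` shows how far a small patch moves the surrogate's decision while the rest of the image stays untouched. Whether the prediction actually flips depends on the model and the patch size, which this sketch does not try to settle.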
