Training Data Poisoning
We also do some interesting work exploring data poisoning. Data poisoning is what happens when someone can control a small fraction of the data that you're training a machine learning model on. It turns out that to poison a multimodal contrastive self-supervised learning model, something like 1 in 10,000 images needs to be malicious. We looked at this on CIFAR-10, which was one where we trained our own models. Training diffusion models is actually very slow and very expensive, so the best we can do for running a bunch of these privacy experiments is to train models ourselves.
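To make the "small fraction of the training data" point concrete, here is a minimal, hypothetical sketch of what a poisoning setup can look like: it plants a simple backdoor trigger (a small white patch plus a flipped label) in roughly 1 in 10,000 examples of a CIFAR-10-sized toy dataset. The function name, the trigger pattern, and the use of random toy data are all illustrative assumptions, not the speakers' actual experimental setup, which involved multimodal contrastive models trained on image-caption data.

```python
import numpy as np

# Roughly the poisoning rate mentioned in the discussion (assumption: applied
# here as a simple backdoor attack on a labelled image dataset).
POISON_RATE = 1 / 10_000
TARGET_LABEL = 0  # class the attacker wants triggered images mapped to

def poison_dataset(images, labels, rate=POISON_RATE, target=TARGET_LABEL, seed=0):
    """Return copies of (images, labels) with `rate` of the examples poisoned.

    Poisoned images get a small white patch in the top-left corner (the
    "trigger") and their label is flipped to the attacker's target class.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = max(1, int(len(images) * rate))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, :3, :3, :] = 255  # 3x3 white trigger patch
    labels[idx] = target
    return images, labels, idx

# Toy data standing in for CIFAR-10: 50,000 32x32 RGB images, 10 classes.
images = np.random.randint(0, 256, size=(50_000, 32, 32, 3), dtype=np.uint8)
labels = np.random.randint(0, 10, size=50_000)

poisoned_images, poisoned_labels, idx = poison_dataset(images, labels)
print(f"poisoned {len(idx)} of {len(images)} examples "
      f"({len(idx) / len(images):.4%})")
```

At a 1-in-10,000 rate this touches only about 5 of the 50,000 examples, which is the striking part of the claim: an attacker who controls that little of the training set can still influence what the model learns.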