MetaPoison: Practical General-purpose Clean-label Data Poisoning

by W. Ronny Huang, et al.

Data poisoning, the process by which an attacker takes control of a model by making imperceptible changes to a subset of the training data, is an emerging threat in the context of neural networks. Existing attacks for data poisoning have relied on hand-crafted heuristics. Instead, we pose crafting poisons more generally as a bi-level optimization problem, where the inner level corresponds to training a network on a poisoned dataset and the outer level corresponds to updating those poisons to achieve a desired behavior on the trained model. We then propose MetaPoison, a first-order method to solve this optimization quickly. MetaPoison is effective: it outperforms previous clean-label poisoning methods by a large margin under the same setting. MetaPoison is robust: its poisons transfer to a variety of victims with unknown hyperparameters and architectures. MetaPoison is also general-purpose, working not only in fine-tuning scenarios, but also for end-to-end training from scratch with remarkable success, e.g. causing a target image to be misclassified 90% of the time via manipulating just 1% of the dataset. Additionally, MetaPoison can achieve arbitrary adversary goals not previously possible, like using poisons of one class to make a target image don the label of another arbitrarily chosen class. Finally, MetaPoison works in the real world. We demonstrate successful data poisoning of models trained on Google Cloud AutoML Vision. Code and premade poisons are provided at
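
The bi-level structure described in the abstract can be written out as a rough sketch. The notation here is an assumption for illustration and is not taken from this page: X_c denotes the clean training data, X_p the poisoned examples, X_base the unperturbed images the poisons start from, x_t the target image, y_adv the label the attacker wants assigned to it, θ the network weights, and ε a bound that keeps the perturbation imperceptible.

```latex
% Outer level: choose poisons so that the trained model mislabels the target.
X_p^{*} \;=\; \operatorname*{arg\,min}_{X_p \,:\, \lVert X_p - X_{\text{base}} \rVert_{\infty} \le \epsilon}
      \; \mathcal{L}_{\text{adv}}\bigl(x_t,\, y_{\text{adv}};\, \theta^{*}(X_p)\bigr)

% Inner level: the victim trains as usual on the clean data plus the poisons.
\theta^{*}(X_p) \;=\; \operatorname*{arg\,min}_{\theta}
      \; \mathcal{L}_{\text{train}}\bigl(X_c \cup X_p,\, Y;\, \theta\bigr)
```

The outer loss depends on the poisons only through the trained weights θ*(X_p), which is what makes the problem bi-level: in this formulation, updating X_p requires some way of propagating gradients through the inner training procedure, or an approximation of it.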




