Strong Baseline Defenses Against Clean-Label Poisoning Attacks

09/29/2019
by   Neal Gupta, et al.

Targeted clean-label poisoning is a type of adversarial attack on machine learning systems in which the adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing the deployed model to misclassify a particular test sample at inference time. Although defenses have been proposed for general poisoning attacks (those that aim to reduce overall test accuracy), no reliable defense for clean-label attacks has been demonstrated, despite the attacks' effectiveness and realistic use cases. In this work, we propose a set of simple yet highly effective defenses against these attacks. We test our proposed approach against two recently published clean-label poisoning attacks, both of which use the CIFAR-10 dataset. After reproducing their experiments, we demonstrate that our defenses are able to detect over 99% of them without any compromise in model performance. Our simple defenses show that current clean-label poisoning attack strategies can be annulled, and they serve as a strong but simple-to-implement baseline defense against which to test future clean-label poisoning attacks.
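The abstract does not spell out the defense mechanism itself. One common baseline for filtering clean-label poisons is a feature-space neighbor-consistency check: a training point whose label disagrees with the majority of its nearest neighbors in the model's feature space is flagged as suspicious and removed before (re)training. The sketch below illustrates that general idea only; the function names and the feature extractor are hypothetical, and this is not presented as the authors' exact method.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def filter_by_neighbor_consistency(features, labels, k=10):
    """Flag training points whose label disagrees with the majority of
    their k nearest neighbors in feature space.

    features: (n, d) array of penultimate-layer activations (hypothetical
              output of some feature extractor run over the training set)
    labels:   (n,) array of integer class labels
    Returns a boolean mask: True = keep, False = suspected poison.
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)
    # Column 0 is each point itself, so drop it before inspecting neighbors.
    neighbor_labels = labels[idx[:, 1:]]                      # shape (n, k)
    # Fraction of neighbors that share the point's own label.
    agreement = (neighbor_labels == labels[:, None]).mean(axis=1)
    return agreement >= 0.5

# Hypothetical usage: train only on points that pass the filter.
# keep = filter_by_neighbor_consistency(extract_features(x_train), y_train)
# model.fit(x_train[keep], y_train[keep])
```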


Related research

05/31/2023
Adversarial Clean Label Backdoor Attacks and Defenses on Text Classification Systems
Clean-label (CL) attack is a form of data poisoning attack where an adve...

10/13/2021
Traceback of Data Poisoning Attacks in Neural Networks
In adversarial machine learning, new defenses against attacks on deep le...

06/14/2022
Turning a Curse Into a Blessing: Enabling Clean-Data-Free Defenses by Model Inversion
It is becoming increasingly common to utilize pre-trained models provide...

03/07/2022
Low-Loss Subspace Compression for Clean Gains against Multi-Agent Backdoor Attacks
Recent exploration of the multi-agent backdoor attack demonstrated the b...

04/30/2023
Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks
Adversarial training (AT) is a robust learning algorithm that can defend...

12/21/2022
Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks
We introduce camouflaged data poisoning attacks, a new attack vector tha...

08/09/2021
Classification Auto-Encoder based Detector against Diverse Data Poisoning Attacks
Poisoning attacks are a category of adversarial machine learning threats...
