Backdoor Attacks on Self-Supervised Learning

05/21/2021
by   Aniruddha Saha, et al.

Large-scale unlabeled data has enabled recent progress in self-supervised learning methods that learn rich visual representations. State-of-the-art self-supervised methods for learning representations from images (MoCo and BYOL) use an inductive bias that different augmentations (e.g., random crops) of an image should produce similar embeddings. We show that such methods are vulnerable to backdoor attacks in which an attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images. The model performs well on clean test images, but the attacker can manipulate the model's decision by presenting the trigger at test time. Backdoor attacks have been studied extensively in supervised learning, and to the best of our knowledge, we are the first to study them for self-supervised learning. Backdoor attacks are more practical in self-supervised learning because the unlabeled data is large, so inspecting the data to rule out poisoned examples is prohibitive. We show that in our targeted attack, the attacker can produce many false positives for the target category by using the trigger at test time. We also propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack. Our code is available here: https://github.com/UMBCvision/SSL-Backdoor.
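The poisoning step described in the abstract can be illustrated with a short sketch. This is not the authors' released code; the function names (`add_trigger`, `poison_dataset`), the patch placement policy, and the poison rate are illustrative assumptions. The key property of the SSL setting is visible here: no labels are touched, only a small patch is pasted onto a fraction of the images.

```python
import numpy as np

def add_trigger(image, trigger, loc=None, rng=None):
    """Paste a small trigger patch onto an image.

    image, trigger: (H, W, C) uint8 arrays; loc is the patch's top-left
    (row, col) corner, chosen uniformly at random when None.
    """
    rng = rng if rng is not None else np.random.default_rng()
    h, w = trigger.shape[:2]
    H, W = image.shape[:2]
    if loc is None:
        loc = (rng.integers(0, H - h + 1), rng.integers(0, W - w + 1))
    poisoned = image.copy()
    r, c = loc
    poisoned[r:r + h, c:c + w] = trigger
    return poisoned

def poison_dataset(images, trigger, rate=0.5, rng=None):
    """Poison a fraction `rate` of the images with the trigger.

    Labels are never modified -- in the self-supervised setting the
    attacker only controls (part of) the unlabeled image collection.
    Returns the poisoned list and the set of poisoned indices.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    n_poison = int(len(images) * rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    out = [img.copy() for img in images]
    for i in idx:
        out[i] = add_trigger(out[i], trigger, rng=rng)
    return out, set(int(i) for i in idx)
```

In a targeted attack along the lines the abstract describes, the poisoned images would all come from one target category, so that augmentation-invariance training associates the trigger with that category's embedding region.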


Related research

- 04/04/2023: Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning. Recently, self-supervised learning (SSL) was shown to be vulnerable to p...
- 06/16/2022: Backdoor Attacks on Vision Transformers. Vision Transformers (ViT) have recently demonstrated exemplary performan...
- 02/22/2023: ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms. Backdoor data detection is traditionally studied in an end-to-end superv...
- 03/18/2023: Exploring Expression-related Self-supervised Learning for Affective Behaviour Analysis. This paper explores an expression-related self-supervised learning (SSL)...
- 10/13/2022: Demystifying Self-supervised Trojan Attacks. As an emerging machine learning paradigm, self-supervised learning (SSL)...
- 08/06/2022: Constrained self-supervised method with temporal ensembling for fiber bundle detection on anatomic tracing data. Anatomic tracing data provides detailed information on brain circuitry e...
- 01/20/2023: Towards Understanding How Self-training Tolerates Data Backdoor Poisoning. Recent studies on backdoor attacks in model training have shown that pol...
