Backdoor Cleansing with Unlabeled Data

11/22/2022
by Lu Pang, et al.

Due to the increasing computational demand of Deep Neural Networks (DNNs), companies and organizations have begun to outsource the training process. However, externally trained DNNs can potentially be backdoored. It is crucial to defend against such attacks, i.e., to post-process a suspicious model so that its backdoor behavior is mitigated while its normal prediction power on clean inputs remains uncompromised. To remove abnormal backdoor behavior, existing methods mostly rely on additional labeled clean samples. However, such a requirement may be unrealistic, as the training data are often unavailable to end users. In this paper, we investigate the possibility of circumventing this barrier. We propose a novel defense method that does not require training labels. Through carefully designed layer-wise weight re-initialization and knowledge distillation, our method can effectively cleanse the backdoor behavior of a suspicious network with negligible compromise to its normal behavior. In experiments, we show that our method, trained without labels, is on par with state-of-the-art defense methods trained using labels. We also observe promising defense results even on out-of-distribution data, which makes our method very practical.
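The recipe in the abstract (re-initialize part of the suspicious network layer by layer, then distill the original model's predictions into it using only clean, unlabeled data) can be sketched roughly as follows. This is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: the choice of layers to re-initialize (`reinit_prefixes`), the temperature, the optimizer settings, and the `unlabeled_loader` of clean images are all placeholders.

```python
import copy

import torch
import torch.nn.functional as F


def cleanse_with_distillation(suspicious_model, unlabeled_loader,
                              reinit_prefixes=("layer4", "fc"),
                              temperature=4.0, epochs=10, lr=0.01,
                              device="cuda"):
    """Label-free backdoor cleansing sketch: re-init + distillation.

    The suspicious model serves as the teacher; the student is a copy
    whose later layers are re-initialized to discard trigger-specific
    weights. Training the student to match the teacher's softened
    predictions on clean, unlabeled data transfers normal behavior
    only, since the trigger never appears in the distillation data.
    """
    teacher = suspicious_model.to(device).eval()
    student = copy.deepcopy(suspicious_model).to(device)

    # Layer-wise weight re-initialization: reset the chosen layers of
    # the student. The prefixes here (ResNet-style names) are purely
    # illustrative; which layers to reset is a design choice.
    for name, module in student.named_modules():
        if any(name.startswith(p) for p in reinit_prefixes):
            if hasattr(module, "reset_parameters"):
                module.reset_parameters()

    optimizer = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    student.train()
    for _ in range(epochs):
        for x in unlabeled_loader:  # batches of images, no labels
            x = x.to(device)
            with torch.no_grad():
                teacher_logits = teacher(x)
            student_logits = student(x)
            # Standard distillation loss: KL divergence between the
            # temperature-softened teacher and student distributions.
            loss = F.kl_div(
                F.log_softmax(student_logits / temperature, dim=1),
                F.softmax(teacher_logits / temperature, dim=1),
                reduction="batchmean",
            ) * temperature ** 2
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```

Because the loss depends only on the teacher's outputs, no ground-truth labels are needed, which is also what allows distillation data drawn from a different (out-of-distribution) source to work in principle.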

Related research

01/15/2021
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks
Deep neural networks (DNNs) are known vulnerable to backdoor attacks, a ...

05/24/2023
Reconstructive Neuron Pruning for Backdoor Defense
Deep neural networks (DNNs) have been found to be vulnerable to backdoor...

04/08/2022
Backdoor Attack against NLP models with Robustness-Aware Perturbation defense
Backdoor attack intends to embed hidden backdoor into deep neural networ...

08/23/2023
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
We present a novel defense against backdoor attacks on Deep Neural Netw...

06/29/2023
Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features
Recent studies have demonstrated the susceptibility of deep neural netwo...

10/12/2022
Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork
Deep neural networks (DNNs) are vulnerable to backdoor attacks. Previous...

02/19/2020
NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs
This paper proposes a novel two-stage defense (NNoculation) against back...
