Threat Model-Agnostic Adversarial Defense using Diffusion Models

by Tsachi Blau, et al.

Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks. Following the discovery of this vulnerability in real-world imaging and vision applications, the associated safety concerns have attracted vast research attention, and many defense techniques have been developed. Most of these defense methods rely on adversarial training (AT) – training the classification network on images perturbed according to a specific threat model, which defines the magnitude of the allowed modification. Although AT leads to promising results, training on a specific threat model fails to generalize to other types of perturbations. A different approach utilizes a preprocessing step that removes the adversarial perturbation from the attacked image. In this work, we follow the latter path and aim to develop a technique that yields robust classifiers across various realizations of threat models. To this end, we harness recent advances in stochastic generative modeling and leverage them for sampling from conditional distributions. Our defense relies on adding i.i.d. Gaussian noise to the attacked image, followed by a pretrained diffusion process – an architecture that performs a stochastic iterative denoising procedure over a denoising network, yielding a high-perceptual-quality denoised outcome. The robustness obtained with this stochastic preprocessing step is validated through extensive experiments on the CIFAR-10 dataset, showing that our method outperforms the leading defense methods under various threat models.
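The purification pipeline the abstract describes can be sketched as follows: inject i.i.d. Gaussian noise into the attacked image up to some intermediate diffusion step, then run the reverse denoising chain of a pretrained diffusion model back to a clean image, which is finally passed to any off-the-shelf classifier. This is a minimal NumPy sketch of DDPM-style purification; the noise schedule, the stopping step `t_star`, and the stand-in `dummy_eps_model` are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np

def make_schedule(T=100, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule, as in standard DDPMs (assumed here)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def purify(x, eps_model, t_star=30, T=100, rng=None):
    """Diffusion-based purification of an attacked image `x`.

    1. Forward step: add Gaussian noise to reach diffusion step t_star,
       drowning the (small) adversarial perturbation in the noise.
    2. Reverse chain: iteratively denoise from t_star back to step 0
       using a pretrained noise-prediction network `eps_model`.
    """
    rng = np.random.default_rng(rng)
    betas, alphas, alpha_bars = make_schedule(T)
    # Forward noising in one shot: x_t = sqrt(a_bar)*x + sqrt(1-a_bar)*eps.
    eps = rng.standard_normal(x.shape)
    x_t = np.sqrt(alpha_bars[t_star]) * x + np.sqrt(1.0 - alpha_bars[t_star]) * eps
    # Reverse (ancestral) sampling back to a clean image.
    for t in range(t_star, -1, -1):
        pred = eps_model(x_t, t)
        mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * pred) / np.sqrt(alphas[t])
        if t > 0:
            x_t = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x_t = mean
    return x_t

# Hypothetical stand-in for a pretrained noise predictor; in practice this
# would be a trained U-Net such as the one from the DDPM literature.
def dummy_eps_model(x_t, t):
    return np.zeros_like(x_t)
```

The purified output `purify(x_adv, eps_model)` would then be fed to a standard classifier; because the preprocessing is stochastic and not tied to any particular perturbation norm, the same pipeline applies unchanged across threat models.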



