Immune Defense: A Novel Adversarial Defense Mechanism for Preventing the Generation of Adversarial Examples

by   Jinwei Wang, et al.

The vulnerability of Deep Neural Networks (DNNs) to adversarial examples has been confirmed. Existing adversarial defenses primarily aim at preventing adversarial examples from attacking DNNs successfully, rather than preventing their generation. If the generation of adversarial examples is unregulated, images within reach are no longer secure and pose a threat to non-robust DNNs. Although gradient obfuscation attempts to address this issue, it has been shown to be circumventable. Therefore, we propose a novel adversarial defense mechanism, which is referred to as immune defense and is the example-based pre-defense. This mechanism applies carefully designed quasi-imperceptible perturbations to the raw images to prevent the generation of adversarial examples for the raw images, and thereby protecting both images and DNNs. These perturbed images are referred to as Immune Examples (IEs). In the white-box immune defense, we provide a gradient-based and an optimization-based approach, respectively. Additionally, the more complex black-box immune defense is taken into consideration. We propose Masked Gradient Sign Descent (MGSD) to reduce approximation error and stabilize the update to improve the transferability of IEs and thereby ensure their effectiveness against black-box adversarial attacks. The experimental results demonstrate that the optimization-based approach has superior performance and better visual quality in white-box immune defense. In contrast, the gradient-based approach has stronger transferability and the proposed MGSD significantly improve the transferability of baselines.


page 1

page 3


Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks

Deep neural networks are vulnerable to adversarial examples, which can m...

Improved Adversarial Robustness via Logit Regularization Methods

While great progress has been made at making neural networks effective a...

IWA: Integrated Gradient based White-box Attacks for Fooling Deep Neural Networks

The widespread application of deep neural network (DNN) techniques is be...

Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks

Deep neural networks have shown to be very vulnerable to adversarial exa...

Boosting Gradient for White-Box Adversarial Attacks

Deep neural networks (DNNs) are playing key roles in various artificial ...

Defending Against Adversarial Attacks Using Random Forests

As deep neural networks (DNNs) have become increasingly important and po...

Benchmarking adversarial attacks and defenses for time-series data

The adversarial vulnerability of deep networks has spurred the interest ...

Please sign up or login with your details

Forgot password? Click here to reset