Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

by   Zeyu Qin, et al.

Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples, which can produce erroneous predictions by injecting imperceptible perturbations. In this work, we study the transferability of adversarial examples, which is significant due to its threat to real-world applications where model architecture or parameters are usually unknown. Many existing works reveal that the adversarial examples are likely to overfit the surrogate model that they are generated from, limiting its transfer attack performance against different target models. To mitigate the overfitting of the surrogate model, we propose a novel attack method, dubbed reverse adversarial perturbation (RAP). Specifically, instead of minimizing the loss of a single adversarial point, we advocate seeking adversarial example located at a region with unified low loss value, by injecting the worst-case perturbation (the reverse adversarial perturbation) for each step of the optimization procedure. The adversarial attack with RAP is formulated as a min-max bi-level optimization problem. By integrating RAP into the iterative process for attacks, our method can find more stable adversarial examples which are less sensitive to the changes of decision boundary, mitigating the overfitting of the surrogate model. Comprehensive experimental comparisons demonstrate that RAP can significantly boost adversarial transferability. Furthermore, RAP can be naturally combined with many existing black-box attack techniques, to further boost the transferability. When attacking a real-world image recognition system, Google Cloud Vision API, we obtain 22 over the compared method. Our codes are available at


page 1

page 2

page 3

page 4


StyLess: Boosting the Transferability of Adversarial Examples

Adversarial attacks can mislead deep neural networks (DNNs) by adding im...

Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings

One intriguing property of adversarial attacks is their "transferability...

Rethinking the Backward Propagation for Adversarial Transferability

Transfer-based attacks generate adversarial examples on the surrogate mo...

Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations

Transferable adversarial attacks optimize adversaries from a pretrained ...

Structure Matters: Towards Generating Transferable Adversarial Images

Recent works on adversarial examples for image classification focus on d...

Transferable Sparse Adversarial Attack

Deep neural networks have shown their vulnerability to adversarial attac...

Improving Transferability of Adversarial Examples on Face Recognition with Beneficial Perturbation Feature Augmentation

Face recognition (FR) models can be easily fooled by adversarial example...

Please sign up or login with your details

Forgot password? Click here to reset