Enhancing Targeted Attack Transferability via Diversified Weight Pruning

08/18/2022
by   Hung-Jui Wang, et al.

Malicious attackers can generate targeted adversarial examples by imposing human-imperceptible noise on images, forcing neural network models to produce specific incorrect outputs. With cross-model transferable adversarial examples, neural networks remain vulnerable even when the model information is kept secret from the attacker. Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples. However, existing methods fall short under the more challenging scenario of creating targeted attacks that transfer among distinct models. In this work, we propose Diversified Weight Pruning (DWP), which further enhances ensemble-based methods by leveraging the weight pruning commonly used in model compression. Specifically, we obtain multiple diverse models via random weight pruning. These models preserve comparable accuracy and serve as additional members for ensemble-based methods, yielding stronger transferable targeted attacks. We provide experiments on the ImageNet-Compatible dataset under two challenging scenarios: transferring to distinct architectures and to adversarially trained models. The results show that our proposed DWP improves targeted attack success rates by up to 4.1% when combined with state-of-the-art methods.
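The core idea of DWP, obtaining several diverse yet similarly accurate copies of one network by randomly zeroing a small fraction of its weights, can be illustrated with a minimal NumPy sketch. The `random_prune` helper below is hypothetical (the paper's actual pruning procedure and ratios are not given in this abstract); it only shows how independent random masks produce distinct ensemble members from a single weight tensor.

```python
import numpy as np

def random_prune(weights, prune_ratio, rng):
    """Return a copy of `weights` with roughly `prune_ratio` of its
    entries zeroed out by an independent random mask.

    Hypothetical helper illustrating the random-weight-pruning idea;
    not the paper's exact procedure."""
    mask = rng.random(weights.shape) >= prune_ratio
    return weights * mask

# One "model" weight tensor, pruned three times with different masks.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
ensemble = [random_prune(w, prune_ratio=0.1, rng=rng) for _ in range(3)]

# Every surviving entry equals the original weight; the rest are zero,
# so each copy is a slightly different view of the same model.
for w_pruned in ensemble:
    assert np.all((w_pruned == 0) | (w_pruned == w))
```

In an actual attack, each pruned copy would be loaded as a separate forward model and averaged into the ensemble loss used to craft the adversarial example.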


Related research

08/21/2023 - Enhancing Adversarial Attacks: The Similar Target Method
Deep neural networks are vulnerable to adversarial examples, posing a th...

07/05/2021 - Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks
Transfer-based adversarial attacks can effectively evaluate model robust...

10/09/2022 - Pruning Adversarially Robust Neural Networks without Adversarial Examples
Adversarial pruning compresses models while preserving robustness. Curre...

12/04/2019 - Towards Robust Image Classification Using Sequential Attention Models
In this paper we propose to augment a modern neural-network architecture...

03/17/2022 - Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input
The transferability of adversarial examples allows the deception on blac...

03/07/2023 - Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration
Previous works have extensively studied the transferability of adversari...

11/05/2018 - On the Transferability of Adversarial Examples Against CNN-Based Image Forensics
Recent studies have shown that Convolutional Neural Networks (CNN) are r...
