On the Sensitivity of Adversarial Robustness to Input Data Distributions

02/22/2019
by Gavin Weiguang Ding, et al.

Neural networks are vulnerable to small adversarial perturbations. The existing literature has largely focused on understanding and mitigating the vulnerability of learned models. In this paper, we demonstrate an intriguing phenomenon about the most popular robust training method in the literature, adversarial training: adversarial robustness, unlike clean accuracy, is sensitive to the input data distribution. Even a semantics-preserving transformation of the input data distribution can lead to significantly different robustness for an adversarially trained model that is both trained and evaluated on the new distribution. Our discovery of this sensitivity to the data distribution is based on a study that disentangles the behaviors of the clean accuracy and the robust accuracy of the Bayes classifier. Empirical investigations further confirm our finding. We construct semantically identical variants of MNIST and CIFAR10 and show that standardly trained models achieve comparable clean accuracies on them, while adversarially trained models achieve significantly different robust accuracies. This counter-intuitive phenomenon indicates that the input data distribution alone, not necessarily the task itself, can affect the adversarial robustness of trained neural networks. Lastly, we discuss the practical implications for evaluating adversarial robustness and make initial attempts to understand this complex phenomenon.
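To make the experimental idea concrete, below is a minimal PyTorch sketch, not the authors' released code, of the kind of measurement the abstract describes: apply a semantics-preserving, saturation-style transform to pixel values, then measure robust accuracy under a standard L-infinity PGD attack on the transformed data. The exact form of `saturate`, the attack hyperparameters, and the helper names are illustrative assumptions.

```python
# Minimal sketch of the experiment described in the abstract (illustrative
# assumptions throughout; this is not the authors' released code).
import torch
import torch.nn.functional as F


def saturate(x, p=2.0):
    """Semantics-preserving 'saturation' transform on pixels in [0, 1].

    p = 2 is the identity; larger p pushes pixels toward {0, 1} while
    leaving the image content recognizable. The exact form is an assumption.
    """
    y = 2.0 * x - 1.0
    return 0.5 * torch.sign(y) * y.abs().pow(2.0 / p) + 0.5


def pgd_attack(model, x, y, eps=0.3, alpha=0.01, steps=40):
    """Standard L-infinity PGD: random start, signed-gradient ascent on the
    loss, projection back into the eps-ball and the valid pixel range."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project to eps-ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()


def robust_accuracy(model, loader, p, eps=0.3):
    """Robust accuracy of `model` on the transformed distribution: every
    input is saturated with level p, then attacked with PGD."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x = saturate(x, p)  # per the abstract, the model would also be
        x_adv = pgd_attack(model, x, y, eps=eps)  # trained on this variant
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```

Under this setup, sweeping the saturation level p while holding the architecture, training procedure, and attack fixed would isolate the effect of the input distribution, which is the comparison the abstract argues matters for robust accuracy but not for clean accuracy.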


Related Research

12/24/2022
Frequency Regularization for Improving Adversarial Robustness
Deep neural networks are incredibly vulnerable to crafted, human-imperce...

07/04/2022
Removing Batch Normalization Boosts Adversarial Training
Adversarial training (AT) defends deep neural networks against adversari...

03/16/2018
Vulnerability of Deep Learning
The Renormalisation Group (RG) provides a framework in which it is possi...

10/21/2020
Precise Statistical Analysis of Classification Accuracies for Adversarial Training
Despite the wide empirical success of modern machine learning algorithms...

08/25/2021
Bridged Adversarial Training
Adversarial robustness is considered as a required property of deep neur...

09/03/2021
How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data
Since training a large-scale backdoored model from scratch requires a la...

01/09/2018
Adversarial Spheres
State of the art computer vision models have been shown to be vulnerable...
