Interpreting Adversarially Trained Convolutional Neural Networks

05/23/2019
by   Tianyuan Zhang, et al.

We attempt to interpret how adversarially trained convolutional neural networks (AT-CNNs) recognize objects. We design systematic approaches to interpret AT-CNNs in both qualitative and quantitative ways and compare them with normally trained models. Surprisingly, we find that adversarial training alleviates the texture bias of standard CNNs when trained on object recognition tasks, and helps CNNs learn a more shape-biased representation. We validate our hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and standard CNNs on clean images and on images under different transformations. This comparison visually shows that the predictions of the two types of CNNs are sensitive to dramatically different types of features. Second, to achieve quantitative verification, we construct additional test datasets that destroy either textures or shapes, such as style-transferred versions of the clean data, saturated images, and patch-shuffled ones, and then evaluate the classification accuracy of AT-CNNs and normal CNNs on these datasets. Our findings shed some light on why AT-CNNs are more robust than normally trained ones, and contribute to a better understanding of adversarial training of CNNs from an interpretation perspective.
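The first validation step rests on gradient-based salience: a pixel matters to the prediction insofar as the class score changes when that pixel changes. As a minimal, framework-free sketch, the idea can be illustrated with a finite-difference approximation; the function name `saliency_map`, the `score_fn` callable, and the numerical scheme are illustrative assumptions here — in practice the gradient would come from backpropagation in a deep-learning framework, not from finite differences.

```python
import numpy as np

def saliency_map(score_fn, image, eps=1e-4):
    """Approximate |d score / d pixel| for each pixel by finite differences.

    score_fn: callable mapping an image array to a scalar class score.
    This numerical version only illustrates the idea on tiny inputs;
    real salience maps for CNNs are computed via backprop.
    """
    image = image.astype(float)
    sal = np.zeros_like(image)
    base = score_fn(image)
    it = np.nditer(image, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        bumped = image.copy()
        bumped[idx] += eps                      # perturb one pixel
        sal[idx] = abs(score_fn(bumped) - base) / eps
    return sal
```

For a linear score function the map recovers the absolute weights exactly, which is a quick sanity check on the implementation.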
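Of the shape-destroying transformations used in the second step, patch shuffling is the simplest to sketch: split the image into a k x k grid of patches and permute them, which destroys global shape while largely preserving local texture statistics. The sketch below is an assumption about one reasonable implementation (grid size, even-crop behavior, and the `rng` parameter are choices made here), not the paper's exact code.

```python
import numpy as np

def patch_shuffle(image, k, rng=None):
    """Split `image` into a k x k grid of patches and randomly permute them.

    Destroys global shape structure while keeping local texture, so a
    texture-biased classifier should degrade less than a shape-biased one.
    """
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    ph, pw = h // k, w // k
    image = image[:ph * k, :pw * k]             # crop to an even k x k grid
    patches = [image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
               for i in range(k) for j in range(k)]
    order = rng.permutation(len(patches))       # random patch permutation
    rows = [np.concatenate([patches[order[i * k + j]] for j in range(k)], axis=1)
            for i in range(k)]
    return np.concatenate(rows, axis=0)
```

The transform is pixel-preserving up to the permutation, so output shape and pixel multiset match the (cropped) input — a useful invariant to test against.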

