Generalizing Adversarial Explanations with Grad-CAM

04/11/2022
by Tanmay Chakraborty, et al.

Gradient-weighted Class Activation Mapping (Grad-CAM) is an example-based explanation method that provides a gradient activation heatmap as an explanation for Convolutional Neural Network (CNN) models. Its drawback is that it explains individual examples and cannot be used to generalize CNN behaviour. In this paper, we present a novel method that extends Grad-CAM from example-based explanations to a method for explaining global model behaviour. This is achieved by introducing two new metrics for model generalization: (i) Mean Observed Dissimilarity (MOD) and (ii) Variation in Dissimilarity (VID). These metrics are computed by comparing, with a Normalized Inverted Structural Similarity Index (NISSIM) metric, the Grad-CAM heatmaps generated for samples from the original test set and for samples from the adversarial test set. In our experiments, we study adversarial attacks generated with the Fast Gradient Sign Method (FGSM) on deep models such as VGG16, ResNet50, and ResNet101, and on wide models such as InceptionNetv3 and XceptionNet. We then compute MOD and VID for the automatic face recognition (AFR) use case with the VGGFace2 dataset. Across all models under adversarial attack, we observe a consistent shift in the region highlighted in the Grad-CAM heatmap, which reflects that region's participation in the decision making. The proposed method can be used to understand adversarial attacks and to explain the behaviour of black-box CNN models for image analysis.
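To make the metric pipeline more concrete, the following is a minimal sketch of how MOD and VID could be computed from paired heatmaps. The abstract does not give the formulas, so several things here are assumptions rather than the authors' definitions: NISSIM is taken to be SSIM inverted and rescaled to [0, 1] as (1 - SSIM) / 2, MOD is taken to be the arithmetic mean of the per-sample NISSIM scores, and VID their population standard deviation. The function names `global_ssim`, `nissim`, and `mod_and_vid` are hypothetical, and SSIM is simplified to a single global window over flattened heatmaps instead of the usual sliding-window average.

```python
from statistics import fmean, pstdev


def global_ssim(a, b, data_range=1.0):
    """Single-window SSIM between two flattened heatmaps of equal size.

    Simplification: standard SSIM averages over local windows; this
    sketch uses one global window to stay dependency-free.
    """
    if len(a) != len(b) or not a:
        raise ValueError("heatmaps must be non-empty and the same size")
    # Usual SSIM stabilizing constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = fmean((x - mu_a) ** 2 for x in a)
    var_b = fmean((y - mu_b) ** 2 for y in b)
    cov = fmean((x - mu_a) * (y - mu_b) for x, y in zip(a, b))
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def nissim(a, b):
    """ASSUMED definition: map SSIM in [-1, 1] to dissimilarity in [0, 1]."""
    return (1.0 - global_ssim(a, b)) / 2.0


def mod_and_vid(original_heatmaps, adversarial_heatmaps):
    """ASSUMED definitions: MOD = mean NISSIM, VID = std. dev. of NISSIM.

    Each argument is a list of flattened heatmaps; entry i of one list
    is paired with entry i of the other (same test sample, clean vs.
    adversarial).
    """
    if len(original_heatmaps) != len(adversarial_heatmaps):
        raise ValueError("need one adversarial heatmap per original heatmap")
    scores = [
        nissim(orig, adv)
        for orig, adv in zip(original_heatmaps, adversarial_heatmaps)
    ]
    return fmean(scores), pstdev(scores)
```

Under these assumptions, a heatmap compared with itself scores close to 0, and a heatmap compared with its inverse scores close to 1. A high MOD would then indicate that the attack moves the highlighted region substantially on average, and a low VID that the shift is consistent across samples. In practice the heatmaps would come from a Grad-CAM implementation and a windowed SSIM from an image-processing library.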


research
03/30/2022

Example-based Explanations with Adversarial Attacks for Respiratory Sound Analysis

Respiratory sound classification is an important tool for remote screeni...
research
06/14/2022

When adversarial attacks become interpretable counterfactual explanations

We argue that, when learning a 1-Lipschitz neural network with the dual ...
research
09/05/2022

"Is your explanation stable?": A Robustness Evaluation Framework for Feature Attribution

Understanding the decision process of neural networks is hard. One vital...
research
12/09/2021

Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers

Recently, Convolutional Neural Network (CNN) has achieved excellent perf...
research
09/23/2018

Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

We improve the robustness of deep neural nets to adversarial attacks by ...
research
05/14/2020

Evolved Explainable Classifications for Lymph Node Metastases

A novel evolutionary approach for Explainable Artificial Intelligence is...
