How Sensitive are Sensitivity-Based Explanations?

by Chih-Kuan Yeh, et al.

We propose a simple, objective evaluation measure for explanations of a complex black-box machine learning model. While most such model explanations have largely been evaluated via qualitative measures, such as how humans might qualitatively perceive the explanations, it is vital to also consider objective measures such as the one we propose in this paper. Our evaluation measure, which we naturally call sensitivity, is simple: it characterizes how an explanation changes as we vary the test input; depending on how we measure these changes, and how we vary the input, we arrive at different notions of sensitivity. We also provide a calculus for deriving the sensitivity of complex explanations in terms of that for simpler explanations, which allows easy computation of sensitivities for yet-to-be-proposed explanations. One advantage of an objective evaluation measure is that we can optimize the explanation with respect to the measure: we show that (1) any given explanation can be simply modified to improve its sensitivity with just a modest deviation from the original explanation, and (2) gradient-based explanations of an adversarially trained network are less sensitive. Perhaps surprisingly, our experiments show that explanations optimized to have lower sensitivity can be more faithful to the model predictions.
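The abstract's core idea, measuring how much an explanation changes as the test input is varied, can be sketched concretely. The following is a minimal illustration, not the paper's implementation: it estimates one particular notion of sensitivity (the worst observed change in a gradient explanation over random perturbations within an L2 ball) for a hypothetical toy model. The model, the perturbation radius, and the Monte Carlo sampling scheme are all illustrative assumptions.

```python
import numpy as np

# Toy "model": a smooth scalar function of a 2-D input (illustrative only).
def model(x):
    return np.tanh(x[0]) + x[0] * x[1] ** 2

# Gradient-based explanation, estimated here by central finite differences.
def grad_explanation(x, eps=1e-5):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (model(x + e) - model(x - e)) / (2 * eps)
    return g

# One notion of sensitivity: the largest change in the explanation over
# random input perturbations of L2 norm r (a Monte Carlo estimate).
def max_sensitivity(x, r=0.1, n_samples=50, seed=0):
    rng = np.random.default_rng(seed)
    base = grad_explanation(x)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.normal(size=x.shape)
        delta *= r / np.linalg.norm(delta)  # scale onto the sphere of radius r
        worst = max(worst, np.linalg.norm(grad_explanation(x + delta) - base))
    return worst

x = np.array([0.5, -1.0])
print(max_sensitivity(x))
```

Varying the norm used to compare explanations, or the set of allowed input perturbations, yields the different notions of sensitivity the abstract alludes to.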




