Explaining Image Classifiers by Adaptive Dropout and Generative In-filling

by Chun-Hao Chang, et al.

Explanations of black-box classifiers often rely on saliency maps, which score the relevance of each input dimension to the resulting classification. Recent approaches compute saliency by optimizing for regions of the input that maximally change the classification outcome when replaced by a reference value. These reference values are chosen by ad-hoc heuristics, such as the input mean. In this work we instead marginalize out masked regions of the input by conditioning a generative model on the rest of the image. Our model-agnostic method produces realistic explanations, generating plausible inputs that would have caused the model to classify differently. Applied to image classification, it yields more compact and relevant explanations with fewer artifacts than previous methods.
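The core idea in the abstract can be sketched concretely: a masked region is scored by how much the classifier's output changes when that region is replaced by an in-fill conditioned on the unmasked pixels, rather than by a heuristic constant. The sketch below is a minimal illustration under stated assumptions; the `classify` and `infill` functions are toy stand-ins (the paper uses a trained black-box classifier and a learned generative in-filling model), and `saliency_of_region` is a hypothetical helper name.

```python
import numpy as np

def saliency_of_region(image, mask, classify, infill):
    """Score a masked region by the drop in classifier confidence
    when the region is replaced by an in-fill conditioned on the
    unmasked pixels (generative in-filling instead of a fixed
    reference value such as the input mean)."""
    infilled = np.where(mask, infill(image, mask), image)
    return classify(image) - classify(infilled)

# Toy stand-ins, purely illustrative.
def classify(img):
    # "Confidence" = mean brightness of the top-left quadrant.
    return float(img[:2, :2].mean())

def infill(img, mask):
    # Crude proxy for a generative model: fill masked pixels with
    # the mean of the unmasked pixels.
    return np.full_like(img, img[~mask].mean())

image = np.zeros((4, 4))
image[:2, :2] = 1.0          # bright region the toy classifier relies on
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True          # mask exactly that region

print(saliency_of_region(image, mask, classify, infill))  # → 1.0
```

Because the evidence for the class sits entirely inside the masked region, in-filling it from the (dark) surround removes all of the classifier's confidence, giving the region a saliency of 1.0; the paper's method additionally optimizes over which regions to mask.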




