Decision Explanation and Feature Importance for Invertible Networks

by Juntang Zhuang, et al.

Deep neural networks are vulnerable to adversarial attacks and hard to interpret because of their black-box nature. The recently proposed invertible network can accurately reconstruct the inputs to a layer from its outputs, and thus has the potential to open up the black-box model. An invertible network classifier can be viewed as a two-stage model: (1) an invertible transformation from the input space to the feature space; (2) a linear classifier in the feature space. The decision boundary of a linear classifier in the feature space can be determined directly; since the transform is invertible, we can map this decision boundary from the feature space back to the input space. Furthermore, we propose to determine the projection of a data point onto the decision boundary, and define the explanation as the difference between the data point and its projection. Finally, we propose to locally approximate a neural network with its first-order Taylor expansion, and define feature importance using the resulting local linear model. We provide the implementation of our method: <>.
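
The two-stage view described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the additive coupling layer, the weights `W1`, `w`, `b`, and the helper names `forward`, `inverse`, `project_to_boundary`, and `local_importance` are all assumptions made up for illustration. The sketch takes one simple reading of the method: project the feature vector onto the linear decision boundary, invert that point back to the input space, and treat the gradient of the logit as the coefficients of the local linear (first-order Taylor) model.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 2))  # hypothetical weights of a toy coupling layer

def forward(x):
    # Toy invertible transform (additive coupling): z1 = x1, z2 = x2 + tanh(W1 x1)
    x1, x2 = x[:2], x[2:]
    return np.concatenate([x1, x2 + np.tanh(W1 @ x1)])

def inverse(z):
    # Exact inverse of the coupling layer above
    z1, z2 = z[:2], z[2:]
    return np.concatenate([z1, z2 - np.tanh(W1 @ z1)])

# Hypothetical linear classifier in feature space: logit = w . z + b
w = np.array([1.0, -2.0, 0.5, 1.5])
b = 0.3

def logit(x):
    return w @ forward(x) + b

def project_to_boundary(x):
    # Orthogonal projection onto the hyperplane w . z + b = 0 in feature space,
    # then mapped back to input space through the inverse transform.
    z = forward(x)
    z_proj = z - (w @ z + b) / (w @ w) * w
    return inverse(z_proj)

def local_importance(x, eps=1e-5):
    # Gradient of the logit by central finite differences; these are the
    # coefficients of the first-order Taylor (local linear) model at x.
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (logit(x + e) - logit(x - e)) / (2 * eps)
    return grad

x = np.array([0.5, -1.0, 2.0, 0.1])
x_proj = project_to_boundary(x)   # point on the decision boundary, in input space
explanation = x - x_proj          # difference between the data point and its projection
importance = local_importance(x)  # local linear feature importance
```

In this sketch, `explanation` shows how the input would have to change to reach the decision boundary, while `importance` ranks input dimensions by how strongly a small change in each one moves the logit. A real invertible network would replace `forward` and `inverse`, and automatic differentiation would replace the finite-difference gradient.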


Invertible Network for Classification and Biomarker Selection for ASD

Determining biomarkers for autism spectrum disorder (ASD) is crucial to ...

Black Box Explanation by Learning Image Exemplars in the Latent Feature Space

We present an approach to explain the decisions of black box models for ...

Defining Locality for Surrogates in Post-hoc Interpretablity

Local surrogate models, to approximate the local decision boundary of a ...

Rethinking the Reverse-engineering of Trojan Triggers

Deep Neural Networks are vulnerable to Trojan (or backdoor) attacks. Rev...

An Adaptive Black-box Backdoor Detection Method for Deep Neural Networks

With the surge of Machine Learning (ML), An emerging amount of intellige...

DE-CROP: Data-efficient Certified Robustness for Pretrained Classifiers

Certified defense using randomized smoothing is a popular technique to p...

Explanation by Progressive Exaggeration

As machine learning methods see greater adoption and implementation in h...
