Revisiting model self-interpretability in a decision-theoretic way for binary medical image classification

by   Sourya Sengupta, et al.

Interpretability is highly desirable for deep neural network-based classifiers, especially for high-stakes decisions in medical imaging. Commonly used post-hoc interpretability methods are not always helpful because different methods can produce several plausible but conflicting interpretations of the same model, leaving it unclear which one to trust. In this work, an inherently interpretable encoder-decoder model coupled with a single-layer fully connected network with unity weights is proposed for binary medical image classification problems. The feature-extraction component of a black-box network trained for the same task is employed as the pre-trained encoder of the interpretable model. The model is trained to estimate the decision statistic of the given trained black-box deep binary classifier, thereby maintaining similar accuracy. The decoder output is a transformed version of the to-be-classified image that, when processed by the fixed fully connected layer, produces the same decision statistic value as the original classifier. This is accomplished by minimizing the mean squared error between the decision statistic values of the black-box model and the encoder-decoder model during training. The decoder output image is referred to as an equivalency map. Because the single-layer network is fully interpretable, the equivalency map visualizes the transformed image features that contribute to the decision statistic value and, moreover, permits quantification of their relative contributions. Unlike traditional post-hoc interpretability methods, the proposed method is inherently interpretable, quantitative, and grounded in decision theory.
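The core mechanism described above can be illustrated with a toy sketch. This is a hypothetical, pure-Python illustration of the idea, not the authors' implementation: all names and numbers (`decision_statistic`, `eq_map`, the black-box value) are made up for demonstration.

```python
# Toy sketch of the "equivalency map" idea from the abstract.
# All names and numbers are hypothetical illustrations.

def decision_statistic(equivalency_map):
    """Fixed single-layer fully connected network with unity weights
    and no bias: the decision statistic is simply the sum of the
    equivalency-map values."""
    return sum(equivalency_map)

def mse(a, b):
    """Squared error between the interpretable model's decision
    statistic and the black-box model's; this is the quantity
    minimized (in expectation over images) during training."""
    return (a - b) ** 2

# A flattened 2x2 equivalency map produced by a hypothetical decoder.
eq_map = [0.8, -0.1, 0.3, 0.0]

s = decision_statistic(eq_map)
black_box_stat = 1.2  # decision statistic of the trained black-box model
loss = mse(s, black_box_stat)

# Because the weights are unity, each map value IS that pixel's
# additive contribution to the decision statistic, which is what
# makes the map directly interpretable and quantitative.
contributions = {i: v for i, v in enumerate(eq_map)}
```

In a full implementation, the encoder-decoder parameters would be updated by gradient descent to reduce this MSE over a training set, with the single-layer network's unity weights held fixed.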


