Regularizing Black-box Models for Improved Interpretability

02/18/2019
by Gregory Plumb, et al.

Most work on interpretability in machine learning has focused on designing either inherently interpretable models, which typically trade off accuracy for interpretability, or post-hoc explanation systems, which lack guarantees about explanation quality. We propose an alternative to these approaches: directly regularizing a black-box model for interpretability at training time. Our approach explicitly connects three key aspects of interpretable machine learning: the model's innate explainability, the explanation system used at test time, and the metrics that measure explanation quality. Our regularizers yield substantial (up to orders-of-magnitude) improvements in explanation fidelity and stability metrics across a range of datasets, models, and black-box explanation systems. Remarkably, they also slightly improve predictive accuracy on average across the nine datasets we consider. Further, we show that the benefits of our regularizers on explanation quality provably generalize to unseen test points.
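To give a flavor of the idea, the sketch below shows one way a "fidelity" penalty for local linear (LIME-style) explanations could be computed: fit an affine surrogate to the model in a neighborhood of a point, then measure how badly the surrogate misses the model there. This is an illustrative assumption, not the authors' implementation; the function names, sampling scheme, and toy models are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_linear_explanation(f, x, radius=0.1, n_samples=50):
    """Fit an affine surrogate g(x') ~= f(x') on points sampled near x
    (a LIME-style local explanation)."""
    X = x + radius * rng.standard_normal((n_samples, x.size))
    y = f(X)
    A = np.hstack([X, np.ones((n_samples, 1))])  # affine features [x, 1]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # surrogate weights plus intercept

def fidelity_penalty(f, x, radius=0.1, n_samples=50):
    """Mean squared error between the model and its local linear surrogate,
    evaluated on a fresh sample from the same neighborhood. During training
    this term would be added to the prediction loss, weighted by a
    hyperparameter, to push the model toward local explainability."""
    coef = local_linear_explanation(f, x, radius, n_samples)
    X = x + radius * rng.standard_normal((n_samples, x.size))
    g = np.hstack([X, np.ones((n_samples, 1))]) @ coef
    return float(np.mean((f(X) - g) ** 2))

# Two toy models (hypothetical): one locally very non-linear, one linear.
f_wiggly = lambda X: np.sin(10 * X).sum(axis=1)
f_linear = lambda X: X.sum(axis=1)

x0 = np.zeros(3)
# The linear model is perfectly explained by a linear surrogate,
# so its penalty is (numerically) zero; the wiggly model's is not.
print(fidelity_penalty(f_wiggly, x0) > fidelity_penalty(f_linear, x0))  # True
```

In the paper's setting this penalty would be differentiated through and minimized alongside the usual training loss, so that the trained black-box model admits high-fidelity local explanations by construction.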

