Sound Explanation for Trustworthy Machine Learning

06/08/2023
by Kai Jia, et al.

We take a formal approach to the explainability problem of machine learning systems. We argue against the practice of interpreting black-box models by attributing scores to input components, because attribution-based interpretation pursues inherently conflicting goals: we prove that no attribution algorithm can simultaneously satisfy specificity, additivity, completeness, and baseline invariance. We then formalize the concept of sound explanation, which has been informally adopted in prior work. A sound explanation provides sufficient information to causally explain the predictions made by a system. Finally, we present feature selection as a sound explanation for cancer prediction models, with the aim of cultivating trust among clinicians.
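The baseline-invariance axiom can be made concrete with a small example. The sketch below is our own illustration with hypothetical values, not the paper's formal proof: for a linear model f(x) = w^T x, integrated gradients has the closed form IG_i = (x_i - b_i) * w_i, so the same input receives different attribution scores under different baselines b, even though each set of scores is complete (it sums to f(x) - f(b)).

```python
# Illustrative sketch (hypothetical values, not from the paper): integrated
# gradients for a linear model f(x) = w^T x has the closed form
# IG_i = (x_i - b_i) * w_i, so the attributions depend on the baseline b.
import numpy as np

def integrated_gradients_linear(x, baseline, w):
    """Closed-form integrated gradients for f(x) = w^T x."""
    return (x - baseline) * w

w = np.array([1.0, -2.0, 0.5])   # weights of a toy linear model
x = np.array([3.0, 1.0, 2.0])    # the input being explained

zero_baseline = np.zeros_like(x)
mean_baseline = np.ones_like(x)  # e.g. a (hypothetical) training-set mean

ig_zero = integrated_gradients_linear(x, zero_baseline, w)  # [ 3. -2.  1. ]
ig_mean = integrated_gradients_linear(x, mean_baseline, w)  # [ 2. -0.  0.5]

# Both attribution vectors satisfy completeness: they sum to f(x) - f(b) ...
assert np.isclose(ig_zero.sum(), w @ x - w @ zero_baseline)
assert np.isclose(ig_mean.sum(), w @ x - w @ mean_baseline)
# ... yet the per-feature scores differ, so the method is not baseline-invariant.
print(ig_zero, ig_mean)
```

Since completeness pins the total attribution to f(x) - f(b), any complete method inherits this baseline dependence; the paper proves a stronger incompatibility among the four axioms. Feature selection sidesteps the issue: a model trained only on the selected features is causally explained by them, since nothing else can influence its output.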


Related research

07/07/2023
On Formal Feature Attribution and Its Approximation
Recent years have witnessed the widespread use of artificial intelligenc...

08/18/2023
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box
Attribution methods shed light on the explainability of data-driven appr...

05/31/2019
Regularizing Black-box Models for Improved Interpretability (HILL 2019 Version)
Most of the work on interpretable machine learning has focused on design...

12/20/2019
Learned Feature Attribution Priors
Deep learning models have achieved breakthrough successes in domains whe...

09/30/2011
Causes of Ineradicable Spurious Predictions in Qualitative Simulation
It was recently proved that a sound and complete qualitative simulator d...

06/08/2020
A Baseline for Shapley Values in MLPs: from Missingness to Neutrality
Being able to explain a prediction as well as having a model that perfor...
