How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations

01/21/2021
by Sérgio Jesus, et al.

There have been several research works proposing new Explainable AI (XAI) methods designed to generate model explanations with specific properties, or desiderata, such as fidelity, robustness, or human-interpretability. However, explanations are seldom evaluated based on their true practical impact on decision-making tasks. Without that assessment, explanations might be chosen that, in fact, hurt the overall performance of the combined system of ML model and end-users. This study aims to bridge this gap by proposing XAI Test, an application-grounded evaluation methodology tailored to isolate the impact of providing the end-user with different levels of information. We conducted an experiment following XAI Test to evaluate three popular post-hoc explanation methods – LIME, SHAP, and TreeInterpreter – on a real-world fraud detection task, with real data, a deployed ML model, and fraud analysts. During the experiment, we gradually increased the information provided to the fraud analysts across three stages: Data Only, i.e., just the transaction data, with access to neither the model score nor explanations; Data + ML Model Score; and Data + ML Model Score + Explanations. Using strong statistical analysis, we show that, in general, these popular explainers have a worse impact than desired. Highlights of the conclusions include: i) the Data Only variant results in the highest decision accuracy and the slowest decision time among all variants tested; ii) all the explainers improve accuracy over the Data + ML Model Score variant, but still result in lower accuracy than Data Only; iii) LIME was the least preferred by users, probably due to the substantially lower variability of its explanations from case to case.
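
For readers unfamiliar with the three explainers compared in the study, the sketch below shows how per-instance feature attributions are typically obtained with SHAP, LIME, and TreeInterpreter for a tree-based tabular classifier. It does not reproduce the authors' experimental code: the fraud data and deployed model are proprietary, so a synthetic dataset and a generic random forest stand in, and all feature names, class names, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' setup): per-instance attributions from the
# three post-hoc explainers evaluated in the paper, on a stand-in tabular model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
import shap
from lime.lime_tabular import LimeTabularExplainer
from treeinterpreter import treeinterpreter as ti

# Synthetic stand-in for the proprietary transaction data and deployed fraud model.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

instance = X[:1]  # one "transaction" to explain

# SHAP: TreeExplainer computes Shapley-value attributions for tree ensembles.
shap_values = shap.TreeExplainer(model).shap_values(instance)

# LIME: fits a local surrogate model around the instance being explained.
lime_explainer = LimeTabularExplainer(
    X,
    feature_names=feature_names,
    class_names=["legit", "fraud"],  # illustrative labels
    mode="classification",
)
lime_exp = lime_explainer.explain_instance(instance[0], model.predict_proba, num_features=5)

# TreeInterpreter: decomposes the prediction into a bias term plus per-feature contributions.
prediction, bias, contributions = ti.predict(model, instance)

print("SHAP values:", shap_values)
print("LIME attributions:", lime_exp.as_list())
print("TreeInterpreter contributions:", contributions[0])
```

In the experiment described above, attributions of this kind were shown to fraud analysts alongside the transaction data and the model score in the final, most informative stage of the study.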
