Sentence-Based Model Agnostic NLP Interpretability

12/24/2020
by Yves Rychener, et al.

Today, surrogate-based interpretability of black-box Natural Language Processing (NLP) models, as in LIME or SHAP, relies on word-based sampling to build explanations. In this paper we explore the use of sentences to tackle NLP interpretability. While this choice may seem straightforward, we show that, when explaining complex classifiers like BERT, the word-based approach raises issues not only of computational complexity but also of out-of-distribution sampling, eventually leading to ill-founded explanations. By using sentences, the altered text remains in-distribution and the dimensionality of the problem is reduced, yielding better fidelity to the black-box at comparable computational complexity.
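The idea lends itself to a LIME-style surrogate in which the interpretable features are sentences rather than words. Below is a minimal sketch of that setup, assuming a black-box scoring function `black_box_predict(texts)` returning one probability per input text; the function name, the naive sentence splitter, and the kernel width are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: sentence-level perturbation sampling for a LIME-style
# linear surrogate. `black_box_predict` is an assumed user-supplied function
# mapping a list of texts to an array of class probabilities.
import re
import numpy as np
from sklearn.linear_model import Ridge

def explain_by_sentence(text, black_box_predict, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    # Each interpretable feature is a sentence, not a word (naive splitter).
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]
    d = len(sentences)

    # Sample binary masks over sentences; 1 keeps a sentence, 0 drops it.
    masks = rng.integers(0, 2, size=(n_samples, d))
    masks[0] = 1  # keep the unperturbed text as the first sample

    # Perturbed texts are concatenations of the kept sentences, so each
    # sample remains made of fluent, in-distribution sentences.
    perturbed = [' '.join(s for s, keep in zip(sentences, m) if keep)
                 for m in masks]
    preds = np.asarray(black_box_predict(perturbed))

    # Weight samples by proximity to the original text (exponential kernel
    # on the fraction of removed sentences), then fit a linear surrogate.
    distance = 1.0 - masks.sum(axis=1) / d
    weights = np.exp(-(distance ** 2) / 0.25)
    surrogate = Ridge(alpha=1.0).fit(masks, preds, sample_weight=weights)

    # Each coefficient is the attribution of the corresponding sentence.
    return list(zip(sentences, surrogate.coef_))
```

Because the number of sentences is typically far smaller than the number of words, the surrogate is fitted over a much lower-dimensional space for the same number of black-box queries.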

Related research

07/04/2017 · Interpretable & Explorable Approximations of Black Box Models
We propose Black Box Explanations through Transparent Approximations (BE...

11/22/2016 · Programs as Black-Box Explanations
Recent work in model-agnostic explanations of black-box machine learning...

10/14/2021 · Can Explanations Be Useful for Calibrating Black Box Models?
One often wants to take an existing, trained NLP model and use it on dat...

08/10/2021 · Post-hoc Interpretability for Neural NLP: A Survey
Natural Language Processing (NLP) models have become increasingly more c...

07/09/2021 · Understanding surrogate explanations: the interplay between complexity, fidelity and coverage
This paper analyses the fundamental ingredients behind surrogate explana...

10/14/2020 · Geometry matters: Exploring language examples at the decision boundary
A growing body of recent evidence has highlighted the limitations of nat...

11/19/2018 · Towards Global Explanations for Credit Risk Scoring
In this paper we propose a method to obtain global explanations for trai...
