Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

04/07/2020
by   Alon Jacovi, et al.
0

With the growing popularity of deep-learning based NLP models, comes a need for interpretable systems. But what is interpretability, and what constitutes a high-quality interpretation? In this opinion piece we reflect on the current state of interpretability evaluation research. We call for more clearly differentiating between different desired criteria an interpretation should satisfy, and focus on the faithfulness criteria. We survey the literature with respect to faithfulness evaluation, and arrange the current approaches around three assumptions, providing an explicit form to how faithfulness is "defined" by the community. We provide concrete guidelines on how evaluation of interpretation methods should and should not be conducted. Finally, we claim that the current binary definition for faithfulness sets a potentially unrealistic bar for being considered faithful. We call for discarding the binary notion of faithfulness in favor of a more graded one, which we believe will be of greater practical utility.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/16/2020

Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability

Recent years have witnessed an increasing number of interpretation metho...
research
05/29/2021

The Definitions of Interpretability and Learning of Interpretable Models

As machine learning algorithms getting adopted in an ever-increasing num...
research
07/12/2017

A Formal Framework to Characterize Interpretability of Procedures

We provide a novel notion of what it means to be interpretable, looking ...
research
12/28/2020

A Survey on Neural Network Interpretability

Along with the great success of deep neural networks, there is also grow...
research
06/09/2017

TIP: Typifying the Interpretability of Procedures

We provide a novel notion of what it means to be interpretable, looking ...
research
09/29/2020

Utility is in the Eye of the User: A Critique of NLP Leaderboards

Benchmarks such as GLUE have helped drive advances in NLP by incentivizi...

Please sign up or login with your details

Forgot password? Click here to reset