Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection

04/04/2020
by Hanjie Chen, et al.

Generating explanations for neural networks has become crucial for their real-world applications, where reliability and trustworthiness matter. In natural language processing, existing methods usually provide an explanation in the form of important features, i.e., words or phrases selected from the input text, but ignore the interactions between them. This makes it difficult for humans to interpret an explanation and connect it to the model's prediction. In this work, we build hierarchical explanations by detecting feature interactions. Such explanations visualize how words and phrases are combined at different levels of the hierarchy, which can help users understand the decision-making of black-box models. The proposed method is evaluated with three neural text classifiers (LSTM, CNN, and BERT) on two benchmark datasets, via both automatic and human evaluations. Experiments demonstrate that the proposed method provides explanations that are both faithful to the models and interpretable to humans.
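The abstract describes detecting interactions between words and phrases and composing them into a hierarchy. As a rough illustration of that idea only, not the authors' actual algorithm, the following Python sketch greedily merges the adjacent pair of spans with the strongest interaction; the toy lexicon "model", the occlusion-based importance score, and the interaction measure are all assumptions introduced here for demonstration.

# Illustrative sketch only: a bottom-up greedy procedure that builds a
# hierarchy over adjacent text spans. The toy model, the occlusion-based
# importance score, and the interaction measure are assumptions made for
# demonstration; the paper's own detection procedure may differ.

def toy_model(tokens):
    """Hypothetical stand-in for a neural classifier's output score:
    a simple sentiment lexicon count."""
    positive = {"good", "great", "love"}
    negative = {"bad", "poor", "hate", "not"}
    return sum(t in positive for t in tokens) - sum(t in negative for t in tokens)

def occlusion_score(tokens, span):
    """Importance of a span, measured as the change in model output
    when the span is removed from the input."""
    i, j = span
    return abs(toy_model(tokens) - toy_model(tokens[:i] + tokens[j:]))

def interaction(tokens, left, right):
    """Non-additivity of two adjacent spans: how much their joint
    effect deviates from the sum of their individual effects."""
    merged = (left[0], right[1])
    return abs(occlusion_score(tokens, merged)
               - occlusion_score(tokens, left)
               - occlusion_score(tokens, right))

def build_hierarchy(tokens):
    """Repeatedly merge the adjacent pair with the strongest interaction,
    recording the spans at every level of the resulting hierarchy."""
    spans = [(i, i + 1) for i in range(len(tokens))]
    levels = [list(spans)]
    while len(spans) > 1:
        k = max(range(len(spans) - 1),
                key=lambda idx: interaction(tokens, spans[idx], spans[idx + 1]))
        spans[k:k + 2] = [(spans[k][0], spans[k + 1][1])]   # merge neighbors
        levels.append(list(spans))
    return levels

if __name__ == "__main__":
    text = "the movie is not good".split()
    for level in build_hierarchy(text):
        print([" ".join(text[i:j]) for i, j in level])

Each printed level corresponds to one layer of the hierarchy. For the toy input, "not" and "good" merge first because their joint occlusion effect differs most from the sum of their individual effects, the kind of non-additive interaction such methods aim to surface.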


Related research

02/20/2022
Hierarchical Interpretation of Neural Text Classification
Recent years have witnessed increasing interests in developing interpret...

04/09/2021
Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks
Explaining neural network models is important for increasing their trust...

10/01/2020
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers
To build an interpretable neural text classifier, most of the prior work...

10/24/2022
Generating Hierarchical Explanations on Text Classification Without Connecting Rules
The opaqueness of deep NLP models has motivated the development of metho...

08/29/2019
Human-grounded Evaluations of Explanation Methods for Text Classification
Due to the black-box nature of deep learning models, methods for explain...

11/08/2019
Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
The impressive performance of neural networks on natural language proces...

03/18/2021
Refining Neural Networks with Compositional Explanations
Neural networks are prone to learning spurious correlations from biased ...
