XAI for Transformers: Better Explanations through Conservative Propagation

02/15/2022
by Ameen Ali et al.

Transformers have become an important workhorse of machine learning, with numerous applications. This necessitates the development of reliable methods for increasing their transparency. Multiple interpretability methods, often based on gradient information, have been proposed. We show that the gradient in a Transformer reflects the function only locally and thus fails to reliably identify the contribution of input features to the prediction. We identify attention heads and LayerNorm as the main sources of such unreliable explanations and propose a more stable way to propagate through these layers. Our proposal, which can be seen as a proper extension of the well-established LRP (Layer-wise Relevance Propagation) method to Transformers, is shown both theoretically and empirically to overcome the deficiency of a simple gradient-based approach, and achieves state-of-the-art explanation performance on a broad range of Transformer models and datasets.
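The conservative propagation rule described in the abstract amounts to treating the attention weights and the LayerNorm rescaling as constants during the backward pass, so that gradient-times-input behaves like an LRP relevance pass. A minimal PyTorch sketch of this idea follows; the function names are illustrative, not from the paper's code, and the forward computations are unchanged (only backpropagation is affected by the `detach()` calls):

```python
import torch

def conservative_layernorm(x, weight, bias, eps=1e-5):
    """LayerNorm whose rescaling factor is detached from the graph.

    The forward output is identical to standard LayerNorm; detaching the
    standard deviation makes the layer act as a fixed linear map under
    backpropagation, so gradient-based relevance is conserved through it.
    """
    mean = x.mean(dim=-1, keepdim=True)
    std = torch.sqrt(x.var(dim=-1, unbiased=False, keepdim=True) + eps)
    # Only the scale is detached; centering is already linear.
    return (x - mean) / std.detach() * weight + bias

def conservative_attention(q, k, v):
    """Attention head whose softmax weights are detached.

    For backpropagation, the head then behaves as a constant linear
    combination of the value vectors, which stabilizes the propagated
    relevance compared to differentiating through the softmax.
    """
    d = q.shape[-1]
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn.detach() @ v
```

Dropping these functions in place of the standard LayerNorm and attention computations, and then running plain gradient-times-input, yields the conservative explanation pass; predictions are unaffected because the forward values are unchanged.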


Related research:

- 01/20/2023: Holistically Explainable Vision Transformers
- 05/25/2023: Concept-Centric Transformers: Concept Transformers with Object-Centric Concept Learning for Interpretability
- 06/10/2022: Learning to Estimate Shapley Values with Vision Transformers
- 05/21/2023: Explaining How Transformers Use Context to Build Predictions
- 10/02/2022: DARTFormer: Finding The Best Type Of Attention
- 01/19/2023: AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation
- 06/01/2022: On Layer Normalizations and Residual Connections in Transformers
