Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps

04/23/2022
by Oren Barkan et al.

Transformer-based language models have significantly advanced the state-of-the-art in many linguistic tasks. As this revolution continues, the ability to explain model predictions has become a major area of interest for the NLP community. In this work, we present Gradient Self-Attention Maps (Grad-SAM) - a novel gradient-based method that analyzes self-attention units and identifies the input elements that best explain the model's prediction. Extensive evaluations on various benchmarks show that Grad-SAM obtains significant improvements over state-of-the-art alternatives.
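The abstract describes gating self-attention maps with their gradients to score input tokens. A minimal sketch of that idea, assuming the attention maps and their gradients with respect to the prediction have already been extracted from the model (the function name and shapes below are illustrative, not the authors' implementation):

```python
import numpy as np

def grad_sam_scores(attn, grads):
    """Token-importance scores in the spirit of Grad-SAM (hedged sketch).

    attn, grads: arrays of shape (layers, heads, seq, seq) holding the
    self-attention maps and the gradients of the model's prediction
    with respect to those maps (both assumed precomputed elsewhere).
    Returns one importance score per input token.
    """
    # Gate each attention weight by its gradient and keep only the
    # positively contributing entries via a ReLU.
    gated = np.maximum(attn * grads, 0.0)
    # Average over layers, heads, and query positions to obtain a single
    # score per key position (input token).
    return gated.mean(axis=(0, 1, 2))

# Illustrative usage with random stand-in tensors (2 layers, 2 heads, 5 tokens).
rng = np.random.default_rng(0)
attn = rng.random((2, 2, 5, 5))
grads = rng.standard_normal((2, 2, 5, 5))
scores = grad_sam_scores(attn, grads)  # shape (5,), all entries >= 0
```

In practice the attention maps and gradients would come from forward/backward hooks on the model's attention modules; the averaging scheme here is one plausible reduction, not necessarily the paper's exact formulation.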


research
03/29/2021

Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers

Transformers are increasingly dominating multi-modal reasoning tasks, su...
research
07/19/2022

Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks

Domestic service robots that support daily tasks are a promising solutio...
research
01/19/2023

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

Generative transformer models have become increasingly complex, with lar...
research
06/28/2022

ZoDIAC: Zoneout Dropout Injection Attention Calculation

Recently, the use of self-attention has yielded state-of-the-art resul...
research
08/12/2019

On the Validity of Self-Attention as Explanation in Transformer Models

Explainability of deep learning systems is a vital requirement for many ...
research
04/06/2019

Reinforcement Learning with Attention that Works: A Self-Supervised Approach

Attention models have had a significant positive impact on deep learning...
research
02/17/2021

Centroid Transformers: Learning to Abstract with Attention

Self-attention, as the key block of transformers, is a powerful mechanis...
