Geometrically Guided Integrated Gradients

06/13/2022
by Md Mahfuzur Rahman, et al.

Interpretability methods for deep neural networks mainly focus on the sensitivity of the class score with respect to the original or perturbed input, usually measured using actual or modified gradients. Some methods also use a model-agnostic approach to understand the rationale behind each prediction. In this paper, we argue and demonstrate that the local geometry of the model parameter space relative to the input can also improve post-hoc explanations. To this end, we introduce an interpretability method called "geometrically-guided integrated gradients" that builds on the gradient calculation along a linear path, as traditionally used in integrated gradients methods. However, instead of integrating gradient information, our method explores the model's dynamic behavior from multiple scaled versions of the input and captures the best possible attribution for each input. We demonstrate through extensive experiments that the proposed approach outperforms vanilla gradients and integrated gradients in both subjective and quantitative assessments. We also propose a "model perturbation" sanity check to complement the traditionally used "model randomization" test.
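
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of standard integrated gradients along a linear path, not the geometrically guided variant proposed in the paper (whose selection rule is not specified in this abstract). The toy logistic model, its weights, the zero baseline, and all function names are assumptions made purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    # Hypothetical toy "class score": a logistic unit over the input features.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    # Analytic gradient of the toy score with respect to the input x.
    s = model(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    # Average the input gradient over scaled inputs on the straight line
    # from the baseline to x (midpoint Riemann sum), then multiply by
    # (x - baseline) feature by feature.
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]

# Illustrative values only; none of these come from the paper.
w = [2.0, -1.0, 0.5]
b = 0.1
x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)

# Completeness check: attributions should sum (approximately) to the
# difference between the score at x and the score at the baseline.
completeness_gap = abs(
    sum(attributions) - (model(x, w, b) - model(baseline, w, b))
)
```

The scaled inputs visited inside the loop are the same kind of points the abstract refers to as "multiple scaled versions of the input"; according to the abstract, the proposed method examines the model's behavior at such points rather than integrating the gradients over them.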

research
04/22/2020

Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution

Integrated gradients as an attribution method for deep neural network mo...
research
10/23/2020

Investigating Saturation Effects in Integrated Gradients

Integrated Gradients has become a popular method for post-hoc model inte...
research
06/17/2021

Guided Integrated Gradients: An Adaptive Path Method for Removing Noise

Integrated Gradients (IG) is a commonly used feature attribution method ...
research
07/21/2020

Pattern-Guided Integrated Gradients

Integrated Gradients (IG) and PatternAttribution (PA) are two establishe...
research
05/31/2023

Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision

Attribution algorithms are frequently employed to explain the decisions ...
research
03/24/2023

IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients

Integrated Gradients (IG) as well as its variants are well-known techniq...
research
02/25/2021

Do Input Gradients Highlight Discriminative Features?

Interpretability methods that seek to explain instance-specific model pr...
