Harmonizing Feature Attributions Across Deep Learning Architectures: Enhancing Interpretability and Consistency

07/05/2023
by Md Abdul Kadir, et al.

Ensuring the trustworthiness and interpretability of machine learning models is critical to their deployment in real-world applications. Feature attribution methods, which provide local explanations of model predictions by attributing importance to individual input features, have gained significant attention. This study examines how feature attributions generalize across deep learning architectures such as convolutional neural networks (CNNs) and vision transformers. We assess the feasibility of using a feature attribution method as a feature detector and examine how these attributions can be harmonized across multiple models that employ distinct architectures but are trained on the same data distribution. By exploring this harmonization, we aim to develop a more coherent and holistic understanding of feature attributions, enhancing the consistency of local explanations across diverse deep learning models. Our findings highlight the potential of harmonized feature attribution methods to improve interpretability and foster trust in machine learning applications, regardless of the underlying architecture.
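The abstract does not specify how attribution consistency across architectures is measured, but the idea can be illustrated with a minimal, hypothetical sketch: compute a gradient-based attribution (here, gradient-times-input for toy linear models standing in for a CNN and a transformer) and score their agreement with cosine similarity. All model weights, inputs, and function names below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def gradient_x_input(weights, x):
    # For a linear model f(x) = w . x, the gradient w.r.t. x is w,
    # so the gradient-times-input attribution is the elementwise product w * x.
    return weights * x

def attribution_consistency(a, b):
    # Cosine similarity between two flattened attribution maps:
    # 1.0 means identical direction, 0.0 means orthogonal explanations.
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two hypothetical models trained on the same data distribution
# (weights chosen arbitrarily for illustration):
x = np.array([1.0, 2.0, -1.0, 0.5])
w_model_a = np.array([0.9, 0.1, -0.5, 0.30])
w_model_b = np.array([0.8, 0.2, -0.4, 0.35])

attr_a = gradient_x_input(w_model_a, x)
attr_b = gradient_x_input(w_model_b, x)
score = attribution_consistency(attr_a, attr_b)
```

A high score suggests the two models attend to similar input features; harmonization, as described in the abstract, would aim to raise this consistency across architectures.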

Related research:

- Show or Suppress? Managing Input Uncertainty in Machine Learning Model Explanations (01/23/2021)
- Using Explanations to Guide Models (03/21/2023)
- Learned Feature Attribution Priors (12/20/2019)
- NoiseGrad: enhancing explanations by introducing stochasticity to model weights (06/18/2021)
- Path Integrals for the Attribution of Model Uncertainties (07/19/2021)
- Impossibility Theorems for Feature Attribution (12/22/2022)
- Improving Prediction Performance and Model Interpretability through Attention Mechanisms from Basic and Applied Research Perspectives (03/24/2023)
