Neural Activation Patterns (NAPs): Visual Explainability of Learned Concepts

06/20/2022
by Alex Bäuerle, et al.

A key to deciphering the inner workings of neural networks is understanding what a model has learned. Promising methods for discovering learned features are based on analyzing activation values, with current techniques focusing on high activation values to reveal interesting features at the neuron level. However, restricting the analysis to high activation values limits layer-level concept discovery. We present a method that instead takes the entire activation distribution into account. By extracting similar activation profiles within the high-dimensional activation space of a neural network layer, we find groups of inputs that are treated similarly. These input groups represent neural activation patterns (NAPs) and can be used to visualize and interpret learned layer concepts. We release a framework with which NAPs can be extracted from pre-trained models and provide a visual introspection tool that can be used to analyze NAPs. We test our method on a variety of networks and show how it complements existing methods for analyzing neural network activation values.
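The core idea, treating a layer's full activation vector for each input as a profile and grouping inputs with similar profiles, can be prototyped in a few lines. The sketch below is not the authors' released framework: the model (ResNet-18 via the torchvision ≥ 0.13 weights API), the probed layer, the placeholder input batch, and the k-means clustering step are all illustrative assumptions.

```python
# Minimal sketch (not the authors' released framework): extract per-input
# activation profiles from one layer of a pre-trained network and group
# inputs whose profiles are similar. Model, layer, batch, and the use of
# k-means are illustrative assumptions.
import numpy as np
import torch
import torchvision.models as models
from sklearn.cluster import KMeans

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

activations = []

def capture(module, inputs, output):
    # One flattened activation vector ("profile") per input in the batch.
    activations.append(output.flatten(start_dim=1).detach().cpu().numpy())

# Probe an intermediate layer; the choice of layer3 is an assumption.
handle = model.layer3.register_forward_hook(capture)

# Placeholder batch standing in for real, preprocessed images (N, 3, 224, 224).
images = torch.randn(32, 3, 224, 224)
with torch.no_grad():
    model(images)
handle.remove()

profiles = np.concatenate(activations, axis=0)  # shape (N, D)

# Inputs that land in the same cluster share an activation profile and form
# a candidate neural activation pattern (NAP) for this layer.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(profiles)
for nap_id in range(5):
    members = np.where(labels == nap_id)[0]
    print(f"NAP candidate {nap_id}: inputs {members.tolist()}")
```

In this reading, the method differs from activation-maximization-style analyses by using the whole activation distribution rather than only the highest values; inspecting the inputs that fall into each cluster is what turns the clusters into interpretable layer-level concepts.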

Related research

- 11/21/2018: Neural Networks with Activation Networks
- 08/30/2022: Constraining Representations Yields Models That Know What They Don't Know
- 08/25/2021: Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images
- 06/02/2019: NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions
- 11/30/2017: TCAV: Relative concept importance testing with Linear Concept Activation Vectors
- 06/12/2023: Adversarial Attacks on the Interpretation of Neuron Activation Maximization
- 01/29/2021: The Mind's Eye: Visualizing Class-Agnostic Features of CNNs
