Interpreting Robustness Proofs of Deep Neural Networks

01/31/2023
by   Debangshu Banerjee, et al.

In recent years, numerous methods have been developed to formally verify the robustness of deep neural networks (DNNs). Though the proposed techniques are effective in providing mathematical guarantees about a DNN's behavior, it is not clear whether the proofs generated by these methods are human-interpretable. In this paper, we bridge this gap by developing new concepts, algorithms, and representations to generate human-understandable interpretations of the proofs. Leveraging the proposed method, we show that the robustness proofs of standard DNNs rely on spurious input features, while the proofs of DNNs trained to be provably robust filter out even the semantically meaningful features. The proofs for DNNs that combine adversarial and provably robust training are the most effective at both selectively filtering out spurious features and relying on human-understandable input features.

research
10/07/2021

Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks

Deep neural networks (DNNs) are known to be vulnerable to adversarial at...
research
06/09/2022

DORA: Exploring outlier representations in Deep Neural Networks

Deep Neural Networks (DNNs) draw their power from the representations th...
research
04/09/2019

Towards Analyzing Semantic Robustness of Deep Neural Networks

Despite the impressive performance of Deep Neural Networks (DNNs) on var...
research
02/25/2023

Bayesian Neural Networks Tend to Ignore Complex and Sensitive Concepts

In this paper, we focus on mean-field variational Bayesian Neural Networ...
research
06/06/2019

Understanding Adversarial Behavior of DNNs by Disentangling Non-Robust and Robust Components in Performance Metric

The vulnerability to slight input perturbations is a worrying yet intrig...
research
06/14/2018

Hierarchical interpretations for neural network predictions

Deep neural networks (DNNs) have achieved impressive predictive performa...
research
10/25/2017

Deep Neural Networks

Deep Neural Networks (DNNs) are universal function approximators providi...
