Detecting Anomalous Inputs to DNN Classifiers By Joint Statistical Testing at the Layers

by Jayaram Raghuram et al.

Detecting anomalous inputs, such as adversarial and out-of-distribution (OOD) inputs, is critical for classifiers deployed in real-world applications, especially deep neural network (DNN) classifiers, which are known to be brittle on such inputs. We propose an unsupervised statistical testing framework for detecting anomalous inputs to a trained DNN classifier based on its internal layer representations. By calculating test statistics at the input and intermediate-layer representations of the DNN, conditioned individually on the predicted class and on the true class of labeled training data, the method characterizes their class-conditional distributions on natural inputs. Given a test input, its extent of non-conformity with respect to the training distribution is captured using p-values of the class-conditional test statistics across the layers, which are then combined using a scoring function designed to score high on anomalous inputs. We focus on adversarial inputs, an important class of anomalous inputs, and also demonstrate the effectiveness of our method on general OOD inputs. The proposed framework also provides an alternative class prediction that can be used to correct the DNN's prediction on (detected) adversarial inputs. Experiments on well-known image classification datasets with strong adversarial attacks, including a custom attack method that uses the internal layer representations of the DNN, demonstrate that our method outperforms or performs comparably with five state-of-the-art detection methods.
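The core scoring idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual method: simulated per-layer statistics stand in for the DNN's class-conditional layer statistics, empirical p-values measure non-conformity against the training distribution, and Fisher's method serves as one simple choice of combining/scoring function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-layer test statistics on natural (training) inputs:
# rows = training samples, columns = DNN layers. In the paper these are
# class-conditional statistics computed at each layer; here we simulate them.
train_stats = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))

def layer_pvalues(test_stats, reference):
    """Empirical right-tailed p-value of each layer's statistic,
    relative to the reference (training) distribution."""
    n = reference.shape[0]
    # Add-one smoothing keeps p-values strictly positive.
    return (1 + np.sum(reference >= test_stats, axis=0)) / (n + 1)

def fisher_score(pvals):
    """Combine per-layer p-values with Fisher's method; a higher
    score indicates greater non-conformity (more anomalous)."""
    return -2.0 * np.sum(np.log(pvals))

natural = rng.normal(0.0, 1.0, size=4)    # statistics of an in-distribution input
anomalous = rng.normal(4.0, 1.0, size=4)  # shifted statistics (anomalous input)

s_nat = fisher_score(layer_pvalues(natural, train_stats))
s_anom = fisher_score(layer_pvalues(anomalous, train_stats))
print(s_nat, s_anom)  # the anomalous input should receive the higher score
```

In practice a detection threshold on the combined score would be calibrated on held-out natural data (e.g., to fix a target false-positive rate).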



Anomaly Detection of Test-Time Evasion Attacks using Class-conditional Generative Adversarial Networks


Detecting Out-of-distribution Examples via Class-conditional Impressions Reappearing


DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows


A Forgotten Danger in DNN Supervision Testing: Generating and Detecting True Ambiguity


MOOD: Multi-level Out-of-distribution Detection


Testing Deep Neural Network based Image Classifiers


A Statistical Defense Approach for Detecting Adversarial Examples

