Generalizable Adversarial Examples Detection Based on Bi-model Decision Mismatch

02/21/2018
by João Monteiro, et al.

Deep neural networks (DNNs) have shown phenomenal success in a wide range of applications. However, recent studies have discovered that they are vulnerable to adversarial examples, i.e., original samples with subtle perturbations added. Such perturbations are often too small to be perceptible to humans, yet they can easily fool neural networks. A few defense techniques against adversarial examples have been proposed, but they require either modifying the target model or prior knowledge of the methods used to generate adversarial examples. Moreover, their performance drops remarkably upon encountering adversarial example types not used during the training stage. In this paper, we propose a new framework that enhances DNNs' robustness by detecting adversarial examples. In particular, we employ the decision layers of independently trained models as features for posterior detection. The proposed framework requires no prior knowledge of adversarial example generation techniques and can be directly applied to unmodified off-the-shelf models. Experiments on the standard MNIST and CIFAR10 datasets show that it generalizes well not only across different adversarial example generation methods but also across various additive perturbations. Specifically, distinct binary classifiers trained on top of our proposed features achieve a high detection rate (>90%) and maintain this performance when tested against unseen attacks.
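The core idea above — concatenating the decision layers of two independently trained models into a feature vector for a posterior binary detector — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `softmax` stand-in and the random logits simulate what `model_a` and `model_b` would output for a batch of inputs, and the detector training step is left to any off-the-shelf binary classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Numerically stable softmax over the class axis."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def detection_features(probs_a, probs_b):
    """Concatenate both models' decision layers into one feature
    vector per sample. A binary classifier trained on these features
    learns to flag inputs on which the two models' decisions mismatch,
    a signal associated with adversarial perturbations."""
    return np.concatenate([probs_a, probs_b], axis=1)

# Toy batch of 8 samples for two hypothetical 10-class models;
# in practice these logits would come from the models' decision layers.
logits_a = rng.normal(size=(8, 10))
logits_b = logits_a + rng.normal(scale=0.1, size=(8, 10))  # near-agreement

feats = detection_features(softmax(logits_a), softmax(logits_b))
print(feats.shape)  # one 20-dimensional feature vector per sample
```

The resulting 20-dimensional vectors (for 10-class models) would then be labeled clean/adversarial and fed to a standard binary classifier, matching the paper's setup of training distinct detectors on top of the proposed features.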


