One Thing to Fool them All: Generating Interpretable, Universal, and Physically-Realizable Adversarial Features

10/07/2021
by Stephen Casper, et al.

It is well understood that modern deep networks are vulnerable to adversarial attacks. However, conventional attack methods fail to produce adversarial perturbations that are intelligible to humans, and they pose limited threats in the physical world. To study feature-class associations in networks and better understand the real-world threats they face, we develop feature-level adversarial perturbations using deep image generators and a novel optimization objective. We term these "feature-fool" attacks. We show that they are versatile and use them to generate targeted feature-level attacks at the ImageNet scale that are simultaneously interpretable, universal to any source image, and physically realizable. These attacks can also reveal spurious, semantically describable feature-class associations, and we use them to guide the design of "copy/paste" adversaries in which one natural image is pasted into another to cause a targeted misclassification.
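As a rough sketch of what such a feature-level attack might look like (an illustration under stated assumptions, not the authors' published method or objective), the PyTorch snippet below perturbs an intermediate feature map of a pretrained image generator and optimizes the perturbation so that the decoded patch, pasted into a batch of source images, drives a frozen classifier toward a chosen target class. The split of the generator into `gen_head`/`gen_tail`, the latent size, and the simple corner-paste insertion are all hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def feature_fool_sketch(gen_head, gen_tail, classifier, sources, target,
                        steps=200, lr=0.05, latent_dim=128):
    """Optimize a perturbation of a generator's intermediate features so the
    decoded patch, pasted into every source image, is classified as `target`.

    gen_head/gen_tail: hypothetical halves of a pretrained image generator.
    classifier: the frozen model under attack (e.g., an ImageNet classifier).
    sources: a batch of natural images the patch must generalize across.
    """
    z = torch.randn(1, latent_dim)                # random starting latent
    feat = gen_head(z).detach()                   # intermediate feature map
    delta = torch.zeros_like(feat, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    labels = torch.full((sources.shape[0],), target, dtype=torch.long)

    for _ in range(steps):
        patch = gen_tail(feat + delta).clamp(-1, 1)  # decoded adversarial patch
        attacked = sources.clone()
        # Paste the patch into a fixed corner of each source image; a real
        # pipeline would use more varied insertion and augmentation.
        ph, pw = patch.shape[2], patch.shape[3]
        attacked[:, :, :ph, :pw] = patch
        loss = F.cross_entropy(classifier(attacked), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()

    return gen_tail(feat + delta).detach()
```

The intuition behind optimizing in the generator's feature space rather than pixel space is that it biases the perturbation toward natural-looking, human-describable features, which is what makes the resulting attacks interpretable and transferable to the physical world.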
