Guidance Through Surrogate: Towards a Generic Diagnostic Attack

12/30/2022
by Muzammal Naseer, et al.

Adversarial training is an effective approach for making deep neural networks robust against adversarial attacks. Recently, various adversarial training defenses have been proposed that not only maintain high clean accuracy but also show significant robustness against popular, well-studied adversarial attacks such as PGD. However, high adversarial robustness can also arise when an attack fails to find adversarial gradient directions, a phenomenon known as 'gradient masking'. In this work, we analyse the effect of label smoothing on adversarial training as one potential cause of gradient masking. We then develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA). Our attack is based on a 'match and deceive' loss that finds optimal adversarial directions through guidance from a surrogate model. The modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size. Furthermore, G-PGA is generic, so it can be combined with an ensemble attack strategy, as we demonstrate for Auto-Attack, improving efficiency and convergence speed. Beyond serving as an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness caused by gradient masking in adversarial defenses.
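The abstract does not specify the exact form of the 'match and deceive' loss, so the sketch below is only a rough illustration of the general idea of surrogate guidance, not the paper's method. It runs a standard L-infinity PGD loop on a joint objective that ascends both the defended target model's cross-entropy loss and a surrogate model's loss, so that the surrogate can still supply usable gradient directions when the target's gradients are masked. The function name guided_pgd, the models target_model and surrogate_model, and the weighting lam are all hypothetical.

```python
import torch
import torch.nn.functional as F

def guided_pgd(target_model, surrogate_model, x, y,
               eps=8/255, alpha=2/255, steps=20, lam=1.0):
    """L-infinity PGD whose objective mixes the target model's loss with a
    surrogate model's loss. This is a generic guidance scheme assumed for
    illustration; the paper's 'match and deceive' loss is not given here."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Ascend the target's loss; the surrogate term keeps the optimizer
        # moving even if the target's gradients are masked.
        loss = (F.cross_entropy(target_model(x_adv), y)
                + lam * F.cross_entropy(surrogate_model(x_adv), y))
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()       # signed gradient ascent
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into eps-ball
            x_adv = x_adv.clamp(0, 1)                 # keep a valid image
    return x_adv.detach()
```

In this simplified form, lam would need tuning per defense, whereas the abstract's claim for G-PGA is precisely that such hand-tuning (restarts, long iteration budgets, step-size search) becomes unnecessary once the surrogate guidance is built into the loss.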


Related research

11/30/2020
Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses
Advances in the development of adversarial attacks have been fundamental...

06/16/2022
Analysis and Extensions of Adversarial Training for Video Classification
Adversarial training (AT) is a simple yet effective defense against adve...

11/26/2018
Bilateral Adversarial Training: Towards Fast Training of More Robust Models Against Adversarial Attacks
In this paper, we study fast training of adversarially robust models. Fr...

10/07/2021
Improving Adversarial Robustness for Free with Snapshot Ensemble
Adversarial training, as one of the few certified defenses against adver...

07/30/2023
On Neural Network approximation of ideal adversarial attack and convergence of adversarial training
Adversarial attacks are usually expressed in terms of a gradient-based o...

03/22/2023
Distribution-restrained Softmax Loss for the Model Robustness
Recently, the robustness of deep learning models has received widespread...

12/15/2022
Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks
Designing powerful adversarial attacks is of paramount importance for th...
