Transferring Adversarial Robustness Through Robust Representation Matching

02/21/2022
by Pratik Vaishnavi, et al.

With the widespread use of machine learning, concerns over its security and reliability have become prevalent. As such, many have developed defenses to harden neural networks against adversarial examples: imperceptibly perturbed inputs that are reliably misclassified. Adversarial training, in which adversarial examples are generated and used during training, is one of the few known defenses able to reliably withstand such attacks. However, it imposes a significant training overhead and scales poorly with model complexity and input dimension. In this paper, we propose Robust Representation Matching (RRM), a low-cost method to transfer the robustness of an adversarially trained model to a new model being trained for the same task, irrespective of architectural differences. Inspired by student-teacher learning, our method introduces a novel training loss that encourages the student to learn the teacher's robust representations. Compared to prior works, RRM is superior with respect to both model performance and adversarial training time. On CIFAR-10, RRM trains a robust model ∼1.8× faster than the state-of-the-art. Furthermore, RRM remains effective on higher-dimensional datasets: on Restricted-ImageNet, RRM trains a ResNet50 model ∼18× faster than standard adversarial training.
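
As a rough illustration of the representation-matching idea, the sketch below combines a standard classification loss on clean data with a term that pulls the student's penultimate-layer features toward those of a frozen, adversarially trained teacher. The function name, the cosine-distance form of the matching term, and the weight alpha are illustrative assumptions, not necessarily the exact loss formulation used in the paper.

```python
import torch
import torch.nn.functional as F

def rrm_style_loss(student_logits, student_feats, teacher_feats, labels, alpha=1.0):
    """Sketch of a representation-matching training loss.

    student_logits : student classifier outputs on clean inputs
    student_feats  : student penultimate-layer features
    teacher_feats  : penultimate-layer features from a frozen,
                     adversarially trained teacher (no gradients flow to it)
    alpha          : assumed weighting between the two terms
    """
    # Standard classification loss on clean data; the student itself
    # never generates adversarial examples during training.
    task_loss = F.cross_entropy(student_logits, labels)

    # Encourage the student's representation to align with the teacher's
    # robust representation; cosine distance is one natural choice.
    match_loss = 1.0 - F.cosine_similarity(
        student_feats, teacher_feats.detach(), dim=1
    ).mean()

    return task_loss + alpha * match_loss
```

In such a setup, each student training step costs roughly one standard forward/backward pass plus a single teacher forward pass, which is what makes transferring robustness this way much cheaper than running adversarial training from scratch.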
