Gradient Adversarial Training of Neural Networks

by Ayan Sinha, et al.

We propose gradient adversarial training, an auxiliary deep learning framework applicable to different machine learning problems. In gradient adversarial training, we leverage a prior belief that in many contexts, simultaneous gradient updates should be statistically indistinguishable from each other. We enforce this consistency using an auxiliary network that classifies the origin of the gradient tensor, and the main network serves as an adversary to the auxiliary network in addition to performing standard task-based training. We demonstrate gradient adversarial training for three different scenarios: (1) as a defense against adversarial examples, we classify gradient tensors and tune them to be agnostic to the class of their corresponding example; (2) for knowledge distillation, we perform binary classification of gradient tensors derived from the student or teacher network and tune the student gradient tensor to mimic the teacher's gradient tensor; and (3) for multi-task learning, we classify the gradient tensors derived from different task loss functions and tune them to be statistically indistinguishable. For each of the three scenarios we show the potential of the gradient adversarial training procedure. Specifically, gradient adversarial training increases the robustness of a network to adversarial attacks, distills knowledge from a teacher network to a student network better than soft targets do, and boosts multi-task learning by aligning the gradient tensors derived from the task-specific loss functions. Overall, our experiments demonstrate that gradient tensors contain latent information about whatever tasks are being trained, and can support diverse machine learning problems when intelligently guided through adversarialization using an auxiliary network.
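To make the role of the auxiliary network in scenario (3) more concrete, here is a minimal sketch of the auxiliary side only. It is not the authors' implementation. It assumes two toy squared-error task heads on a shared 2-D feature and uses a plain logistic regression as a stand-in for the auxiliary network; all names (`task_gradient`, `train_origin_classifier`, `origin_accuracy`) and the toy setup are hypothetical. The adversarial update of the main network is deliberately left out, so the aligned case is simulated by giving both tasks the same head weights.

```python
import math
import random


def task_gradient(w, f, y=0.0):
    """Gradient of 0.5 * (w.f - y)**2 with respect to the shared feature f."""
    residual = sum(wi * fi for wi, fi in zip(w, f)) - y
    return [residual * wi for wi in w]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_origin_classifier(grads, labels, lr=0.5, steps=200):
    """Fit a logistic regression (stand-in auxiliary network, an assumption)
    that predicts which task a gradient came from, by full-batch descent."""
    dim, n = len(grads[0]), len(grads)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * dim, 0.0
        for g, lab in zip(grads, labels):
            err = sigmoid(sum(wi * gi for wi, gi in zip(w, g)) + b) - lab
            for k in range(dim):
                gw[k] += err * g[k] / n
            gb += err / n
        w = [wi - lr * gwi for wi, gwi in zip(w, gw)]
        b -= lr * gb
    return w, b


def accuracy(w, b, grads, labels):
    hits = 0
    for g, lab in zip(grads, labels):
        pred = 1 if sum(wi * gi for wi, gi in zip(w, g)) + b > 0 else 0
        hits += int(pred == lab)
    return hits / len(labels)


def origin_accuracy(w_task_a, w_task_b, n=200, seed=0):
    """Train the origin classifier on gradients from two toy tasks and
    report how well it tells them apart (on the training set itself)."""
    rng = random.Random(seed)
    grads, labels = [], []
    for _ in range(n):
        f = [rng.uniform(0.5, 1.5), rng.uniform(0.5, 1.5)]  # shared feature
        grads.append(task_gradient(w_task_a, f))
        labels.append(1)  # label 1: gradient came from task A
        grads.append(task_gradient(w_task_b, f))
        labels.append(0)  # label 0: gradient came from task B
    w, b = train_origin_classifier(grads, labels)
    return accuracy(w, b, grads, labels)


# Task heads pointing in different directions: gradients reveal their origin.
distinct = origin_accuracy([1.0, 0.2], [0.2, 1.0])
# Identical task heads: gradients coincide, so the classifier can only guess.
aligned = origin_accuracy([1.0, 0.2], [1.0, 0.2])
print("distinct heads:", distinct, "aligned heads:", aligned)
```

In this toy setting, the classifier's accuracy measures how distinguishable the two gradient sources are. In the framework described above, the main network would additionally be trained to push that accuracy toward chance, a step this sketch does not attempt.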


ARDIR: Improving Robustness using Knowledge Distillation of Internal Representation

Adversarial training is the most promising method for learning robust mo...

Mitigating the Accuracy-Robustness Trade-off via Multi-Teacher Adversarial Distillation

Adversarial training is a practical approach for improving the robustnes...

Understanding Robustness in Teacher-Student Setting: A New Perspective

Adversarial examples have appeared as a ubiquitous property of machine l...

Transferring Adversarial Robustness Through Robust Representation Matching

With the widespread use of machine learning, concerns over its security ...

Self-Training and Multi-Task Learning for Limited Data: Evaluation Study on Object Detection

Self-training allows a network to learn from the predictions of a more c...

AID-Purifier: A Light Auxiliary Network for Boosting Adversarial Defense

We propose an AID-purifier that can boost the robustness of adversariall...

Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models

The state-of-the-art Mixture-of-Experts (short as MoE) architecture has ...
