TRS: Transferability Reduced Ensemble via Encouraging Gradient Diversity and Model Smoothness

04/01/2021
by Zhuolin Yang, et al.

Adversarial transferability is an intriguing property of adversarial examples: a perturbation crafted against one model is often also effective against another model, even one from a different model family or training process. To better protect ML systems against adversarial attacks, several questions arise: What are sufficient conditions for adversarial transferability? Is it possible to bound such transferability? Is there a way to reduce transferability in order to improve the robustness of an ensemble ML model? To answer these questions, we first theoretically analyze sufficient conditions for transferability between models and then propose a practical algorithm that reduces transferability within an ensemble to improve its robustness. Our theoretical analysis shows that orthogonality between the gradients of different models alone is not enough to ensure low adversarial transferability: model smoothness is also an important factor. In particular, we provide lower and upper bounds on adversarial transferability for low-risk classifiers based on gradient orthogonality and model smoothness, and we show that under the condition of gradient orthogonality, smoother classifiers guarantee lower adversarial transferability. Building on this analysis, we propose an effective Transferability Reduced Smooth ensemble (TRS) training strategy that trains a robust ensemble with low transferability by enforcing model smoothness and gradient orthogonality between base models. Extensive experiments comparing TRS with state-of-the-art baselines on different datasets show that TRS outperforms all baselines significantly. We believe our analysis of adversarial transferability will inspire future research on developing robust ML models that take these transferability properties into account.
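For intuition, the training objective sketched in the abstract can be viewed as a standard ensemble classification loss plus two regularizers: one penalizing cosine similarity between base models' input gradients (encouraging gradient orthogonality/diversity) and one penalizing input-gradient magnitude as a rough proxy for model smoothness. The PyTorch snippet below is a minimal illustrative sketch, not the paper's exact formulation; the function name trs_loss, the weights lambda_sim and lambda_smooth, and the use of the input-gradient norm as the smoothness term are assumptions made for exposition.

```python
import torch
import torch.nn.functional as F

def trs_loss(models, x, y, lambda_sim=1.0, lambda_smooth=0.1):
    """Hypothetical TRS-style ensemble loss (illustrative only):
    mean cross-entropy over base models, plus a penalty on pairwise
    input-gradient cosine similarity and a penalty on input-gradient
    norms as a crude smoothness proxy. Weights are not the paper's."""
    x = x.clone().detach().requires_grad_(True)
    losses, grads = [], []
    for model in models:
        loss = F.cross_entropy(model(x), y)
        # Per-model input gradient; create_graph=True keeps it
        # differentiable so the penalties can be backpropagated.
        g = torch.autograd.grad(loss, x, create_graph=True)[0]
        losses.append(loss)
        grads.append(g.flatten(1))
    ce = torch.stack(losses).mean()

    # Gradient-diversity penalty: mean absolute cosine similarity
    # between every pair of base models' input gradients.
    sim = x.new_zeros(())
    n = len(grads)
    for i in range(n):
        for j in range(i + 1, n):
            sim = sim + F.cosine_similarity(grads[i], grads[j], dim=1).abs().mean()
    sim = sim / max(n * (n - 1) / 2, 1)

    # Smoothness penalty: small input-gradient norms loosely
    # correspond to locally smoother classifiers.
    smooth = torch.stack([g.norm(dim=1).mean() for g in grads]).mean()

    return ce + lambda_sim * sim + lambda_smooth * smooth
```

In a training loop, trs_loss(models, x, y) would simply replace the usual per-model cross-entropy before the backward pass; the two penalty weights trade off clean accuracy against transferability reduction.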


Related research:

- Improving Ensemble Robustness by Collaboratively Promoting and Demoting Adversarial Robustness (09/21/2020)
- Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization (02/03/2022)
- Evaluating Ensemble Robustness Against Adversarial Attacks (05/12/2020)
- High-Robustness, Low-Transferability Fingerprinting of Neural Networks (05/14/2021)
- Agree to Disagree: Diversity through Disagreement for Better Transferability (02/09/2022)
- Why Does Little Robustness Help? Understanding Adversarial Transferability From Surrogate Training (07/15/2023)
- Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings (04/07/2022)
