Explainable Learning: Implicit Generative Modelling during Training for Adversarial Robustness

07/05/2018
by   Priyadarshini Panda, et al.
4

We introduce Explainable Learning ,ExL, an approach for training neural networks that are intrinsically robust to adversarial attacks. We find that the implicit generative modelling of random noise, during posterior maximization, improves a model's understanding of the data manifold furthering adversarial robustness. We prove our approach's efficacy and provide a simplistic visualization tool for understanding adversarial data, using Principal Component Analysis. Our analysis reveals that adversarial robustness, in general, manifests in models with higher variance along the high-ranked principal components. We show that models learnt with ExL perform remarkably well against a wide-range of black-box attacks.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro