Scaling up the randomized gradient-free adversarial attack reveals overestimation of robustness using established attacks
Modern neural networks are highly non-robust against adversarial manipulation. A significant amount of work has been invested in techniques to compute lower bounds on robustness through formal guarantees and to build provably robust model. However it is still difficult to apply them to larger networks or in order to get robustness against larger perturbations. Thus attack strategies are needed to provide tight upper bounds on the actual robustness. We significantly improve the randomized gradient-free attack for ReLU networks [9], in particular by scaling it up to large networks. We show that our attack achieves similar or significantly smaller robust accuracy than state-of-the-art attacks like PGD or the one of Carlini and Wagner, thus revealing an overestimation of the robustness by these state-of-the-art methods. Our attack is not based on a gradient descent scheme and in this sense gradient-free, which makes it less sensitive to the choice of hyperparameters as no careful selection of the stepsize is required.
READ FULL TEXT