Improving Adversarial Robustness in Weight-quantized Neural Networks
Neural networks are getting deeper and more computation-intensive nowadays. Quantization is a useful technique in deploying neural networks on hardware platforms and saving computation costs with negligible performance loss. However, recent research reveals that neural network models, no matter full-precision or quantized, are vulnerable to adversarial attacks. In this work, we analyze both adversarial and quantization losses and then introduce criteria to evaluate them. We propose a boundary-based retraining method to mitigate adversarial and quantization losses together and adopt a nonlinear mapping method to defend against white-box gradient-based adversarial attacks. The evaluations demonstrate that our method can better restore accuracy after quantization than other baseline methods on both black-box and white-box adversarial attacks. The results also show that adversarial training suffers quantization loss and does not cooperate well with other training methods.
READ FULL TEXT