Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations

09/19/2020
by Yuan Zang et al.

Adversarial attacks aim to fool deep neural networks with adversarial examples. In natural language processing, a variety of textual adversarial attack models have been proposed, differing in how much access they require to the victim model. Among them, models that need only the victim model's output are better suited to real-world attack scenarios. However, to achieve high attack performance, these models usually have to query the victim model a very large number of times, which is neither efficient nor practical. To address this problem, we propose a reinforcement learning-based attack model that learns from attack history and launches attacks more efficiently. In experiments, we evaluate our model by attacking several state-of-the-art victim models on benchmark datasets for multiple tasks, including sentiment analysis, text classification and natural language inference. The results demonstrate that our model consistently achieves both better attack performance and higher efficiency than recently proposed baselines. We also find that our attack model brings larger robustness improvements to the victim model when used for adversarial training. All code and data for this paper will be made public.
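
The abstract describes a score-based black-box setting: the attacker sees only the victim model's output and reuses what earlier attacks revealed. As a rough illustration of that setting only (not the authors' model), the sketch below runs a word-substitution attack against a toy keyword-based sentiment classifier and nudges per-substitute preferences by how much each query lowers the victim's confidence, so substitutions that worked before become more likely later. The victim classifier, the synonym table, and the simplified policy-gradient-flavoured update are all hypothetical stand-ins chosen so the example runs on its own.

```python
# Illustrative sketch of a score-based black-box word-substitution attack.
# Everything here (victim, synonyms, reward/update rule) is a toy stand-in,
# not the method proposed in the paper.
import math
import random

random.seed(0)

# Hypothetical victim: a keyword-count "classifier" that exposes only the
# probability of the positive class, mimicking a score-only black-box API.
POSITIVE_WORDS = {"good", "great", "enjoyable", "fine"}
NEGATIVE_WORDS = {"bad", "awful", "boring", "poor"}

def victim_positive_prob(tokens):
    score = sum(w in POSITIVE_WORDS for w in tokens) \
          - sum(w in NEGATIVE_WORDS for w in tokens)
    return 1.0 / (1.0 + math.exp(-score))           # sigmoid of a keyword score

# Hypothetical synonym candidates for word-level substitution.
SYNONYMS = {
    "good": ["fine", "decent", "poor"],
    "great": ["enjoyable", "fine", "awful"],
    "movie": ["film", "picture"],
}

def attack(tokens, policy_weights, lr=0.5, max_queries=30):
    """Sample a (position, substitute) action, reward it by how much it lowers
    the victim's positive probability, and update a per-substitute preference
    so later attacks reuse what worked (a crude, policy-gradient-flavoured
    update, kept deliberately simple)."""
    adv = list(tokens)
    base_prob = victim_positive_prob(adv)            # first query to the victim
    queries = 1
    while queries < max_queries:
        # Candidate actions: every replaceable position and its substitutes.
        actions = [(i, s) for i, w in enumerate(adv) if w in SYNONYMS
                   for s in SYNONYMS[w]]
        if not actions:
            break
        # Softmax over learned per-substitute preferences (the "policy").
        logits = [policy_weights.get(s, 0.0) for _, s in actions]
        peak = max(logits)
        probs = [math.exp(l - peak) for l in logits]
        total = sum(probs)
        probs = [p / total for p in probs]
        i, sub = random.choices(actions, weights=probs, k=1)[0]
        trial = adv[:i] + [sub] + adv[i + 1:]
        new_prob = victim_positive_prob(trial)        # one more victim query
        queries += 1
        reward = base_prob - new_prob                 # drop in victim confidence
        # Reinforce substitutes that helped, penalise those that did not.
        policy_weights[sub] = policy_weights.get(sub, 0.0) + lr * reward
        if reward > 0:
            adv, base_prob = trial, new_prob
        if base_prob < 0.5:                           # predicted label flipped
            return adv, queries
    return adv, queries

shared_history = {}                                   # preferences reused across attacks
example = "the movie was good and great".split()
adversarial, n_queries = attack(example, shared_history)
print(" ".join(adversarial), "| queries used:", n_queries)
```

In a real attack the victim would be a deployed model queried through its API, the candidate substitutes would come from a thesaurus or embedding neighbourhood, and the learned preferences would be shared across many attacked inputs, which is where the query savings the abstract emphasises would come from.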


Related research

10/03/2020 - A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples
  Generating adversarial examples for natural language is hard, as natural...

09/06/2021 - Efficient Combinatorial Optimization for Word-level Adversarial Textual Attack
  Over the past few years, various word-level textual attack approaches ha...

11/20/2020 - ONION: A Simple and Effective Defense Against Textual Backdoor Attacks
  Backdoor attacks, which are a kind of emergent training-time threat to d...

09/09/2021 - Multi-granularity Textual Adversarial Attack with Behavior Cloning
  Recently, the textual adversarial attack models become increasingly popu...

11/02/2021 - HydraText: Multi-objective Optimization for Adversarial Textual Attack
  The field of adversarial textual attack has significantly grown over the...

06/19/2020 - Systematic Attack Surface Reduction For Deployed Sentiment Analysis Models
  This work proposes a structured approach to baselining a model, identify...

10/28/2021 - Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework
  Despite great success on many machine learning tasks, deep neural networ...
