CLAWSAT: Towards Both Robust and Accurate Code Models

by Jinghan Jia et al.

We integrate contrastive learning (CL) with adversarial learning to co-optimize the robustness and accuracy of code models. Different from existing works, we show that code obfuscation, a standard code transformation operation, provides a novel means of generating complementary 'views' of code that enable us to achieve both robust and accurate code models. To the best of our knowledge, this is the first systematic study to explore and exploit the robustness and accuracy benefits of (multi-view) code obfuscations in code models. Specifically, we first adopt adversarial code as a robustness-promoting view in CL at the self-supervised pre-training phase. This yields improved robustness and transferability for downstream tasks. Next, at the supervised fine-tuning stage, we show that adversarial training with a properly temporally-staggered schedule of adversarial code generation can further improve the robustness and accuracy of the pre-trained code model. Built on the above two modules, we develop CLAWSAT, a novel self-supervised learning (SSL) framework for code that integrates CL with adversarial views (CLAW) and staggered adversarial training (SAT). Evaluated on three downstream tasks across Python and Java, we show that CLAWSAT consistently yields the best robustness and accuracy (e.g., an 11% gain in robustness and a 6% gain in accuracy on Python code summarization). We additionally demonstrate the effectiveness of adversarial learning in CLAW by analyzing the characteristics of the loss landscape and the interpretability of the pre-trained models.
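The abstract hinges on using semantics-preserving code obfuscation to produce alternative 'views' of the same program for contrastive pairs. The paper's actual transformation and training pipeline is not reproduced here; the following is a minimal, self-contained sketch of one assumed obfuscation (identifier renaming) that yields a behavior-equivalent view of a Python snippet. The `_Renamer` class and `obfuscate` function are illustrative names, not the paper's API.

```python
import ast


class _Renamer(ast.NodeTransformer):
    """Rename function arguments and local variable names to opaque
    identifiers (v0, v1, ...), preserving program semantics."""

    def __init__(self):
        self.mapping = {}

    def _fresh(self, name):
        # Reuse the same opaque name for every occurrence of an identifier.
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._fresh(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._fresh(node.id)
        return node


def obfuscate(source: str) -> str:
    """Return a semantics-preserving, identifier-renamed 'view' of `source`.

    In a CL setup, (source, obfuscate(source)) would form a positive pair;
    an adversarially chosen renaming would play the robustness-promoting
    view described in the abstract.
    """
    tree = ast.parse(source)
    tree = _Renamer().visit(tree)
    return ast.unparse(tree)  # requires Python >= 3.9


src = "def add(x, y):\n    total = x + y\n    return total"
print(obfuscate(src))  # same function, with x/y/total renamed to v0/v1/v2
```

Note that this toy renamer maps every `Name` node, so it is only safe on snippets whose names are all local (no references to globals or builtins); a production obfuscator would resolve scopes first.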




