Automated Static Warning Identification via Path-based Semantic Representation

by Yuwei Zhang, et al.

Static analysis tools help developers detect potential defects early in the software development life cycle, but they often suffer from low precision (i.e., a high false positive rate among reported alarms). To improve the usability of these tools, many automated warning identification techniques have been proposed to help developers separate false positive alarms from actionable ones. However, existing approaches mainly represent the defective code with hand-engineered features or statement-level abstract syntax tree token sequences, and so fail to capture the semantics of the reported alarms. To overcome these limitations, this paper uses the feature extraction and representation abilities of deep neural networks to derive code semantics from control flow graph paths for warning identification. The control flow graph is an abstract representation of how a given program executes, so path sequences generated from it can guide a deep neural network to learn semantic information about the potential defect more accurately. We fine-tune a pre-trained language model to encode the path sequences and capture the semantic representations used for model building. Finally, we conduct extensive experiments on eight open-source projects and compare the proposed approach with state-of-the-art baselines to verify its effectiveness.
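The idea of turning control flow graph paths into sequences for an encoder can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the statement strings, and the helper names `enumerate_paths` and `path_to_sequence` are assumptions made here for illustration, and loops are handled by simply not revisiting a node within one path.

```python
from typing import Dict, List

# Hypothetical toy CFG for a method with one if/else branch:
# node id -> list of successor node ids.
CFG: Dict[str, List[str]] = {
    "entry": ["check"],
    "check": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# Hypothetical statement text attached to each CFG node.
STATEMENTS: Dict[str, str] = {
    "entry": "String s = lookup(key);",
    "check": "if (s != null)",
    "then": "return s.length();",
    "else": "return 0;",
    "exit": "<exit>",
}


def enumerate_paths(cfg: Dict[str, List[str]], start: str, end: str,
                    max_paths: int = 100) -> List[List[str]]:
    """Depth-first enumeration of simple (cycle-free) paths from start to end."""
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if len(paths) >= max_paths:
            return  # cap the number of paths to avoid path explosion
        path.append(node)
        if node == end:
            paths.append(list(path))
        else:
            for succ in cfg.get(node, []):
                if succ not in path:  # skip back-edges so loops terminate
                    dfs(succ, path)
        path.pop()

    dfs(start, [])
    return paths


def path_to_sequence(path: List[str], statements: Dict[str, str]) -> str:
    """Join the statements along one path into a single text sequence."""
    return " ".join(statements[node] for node in path)


if __name__ == "__main__":
    for p in enumerate_paths(CFG, "entry", "exit"):
        print(path_to_sequence(p, STATEMENTS))
```

Each resulting string is one execution path through the method. In a pipeline like the one the abstract describes, such sequences would be tokenized and passed to the fine-tuned pre-trained language model; that step is omitted here because the abstract does not name the model.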




