Program Classification Using Gated Graph Attention Neural Network for Online Programming Service

by Mingming Lu, et al.

Online programming services such as GitHub, TopCoder, and EduCoder have promoted a great deal of social interaction among their users. However, this interaction remains limited and inefficient because source-code repositories are growing rapidly and are difficult to explore manually. Source-code mining offers a promising way to analyze this code so that it becomes easier for service users to understand and share. Among source-code mining tasks, program classification lays a foundation for many tasks related to source-code understanding, because a machine cannot understand a computer program if it cannot classify the program correctly. Although numerous machine learning models, such as Natural Language Processing (NLP) based models and Abstract Syntax Tree (AST) based models, have been proposed to classify programs from their source code, existing work does not fully characterize source code in terms of both syntactic and semantic information. To address this problem, we propose a Graph Neural Network (GNN) based model that integrates data-flow and function-call information into the AST and applies an improved GNN model to the integrated graph, achieving state-of-the-art program classification accuracy. The experimental results show that the proposed work can classify programs with accuracy over 97
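The idea of integrating data-flow and function-call information into the AST can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Python's standard `ast` module as a stand-in parser, and the function name `build_program_graph`, the edge labels, and the simplified data-flow rule (last write to next read in source order, ignoring scopes and control flow) are all hypothetical choices made for illustration. The resulting typed edge list is the kind of input a GNN could consume.

```python
import ast


def build_program_graph(source):
    """Return (nodes, edges); each edge is (src_index, dst_index, edge_type).

    Illustrative only: edge types and the data-flow rule are assumptions,
    not the construction used in the paper.
    """
    tree = ast.parse(source)

    # Collect AST nodes once each, skipping Load/Store context markers.
    nodes, index = [], {}
    for node in ast.walk(tree):
        if isinstance(node, ast.expr_context) or id(node) in index:
            continue
        index[id(node)] = len(nodes)
        nodes.append(node)

    edges = []

    # Syntax edges: parent -> child, taken directly from the AST.
    for node in nodes:
        for child in ast.iter_child_nodes(node):
            if id(child) in index:
                edges.append((index[id(node)], index[id(child)], "ast_child"))

    # Function-call edges: call site -> definition with the same name.
    func_defs = {n.name: n for n in nodes if isinstance(n, ast.FunctionDef)}
    for node in nodes:
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            target = func_defs.get(node.func.id)
            if target is not None:
                edges.append((index[id(node)], index[id(target)], "func_call"))

    # Data-flow edges (crude): last write of a name -> later read of it,
    # in source order, ignoring scopes and control flow.
    names = sorted(
        (n for n in nodes if isinstance(n, ast.Name)),
        key=lambda n: (n.lineno, n.col_offset),
    )
    last_write = {}
    for name in names:
        if isinstance(name.ctx, ast.Store):
            last_write[name.id] = name
        elif isinstance(name.ctx, ast.Load) and name.id in last_write:
            src = last_write[name.id]
            edges.append((index[id(src)], index[id(name)], "data_flow"))

    return nodes, edges
```

Calling `build_program_graph` on a small program that defines a function, assigns a variable, and passes that variable to the function yields the ordinary parent-child edges plus one `func_call` edge from the call site to the definition and one `data_flow` edge from the assignment to the use. A production pipeline would need a real data-flow analysis and a parser for the target language.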




