Exploring Representation of Horn Clauses using GNNs (technique report)

by Chencheng Liang, et al.

Learning program semantics from raw source code is challenging for two reasons: the syntax of real-world programming languages is complex, and long-distance relational information, which programs represent only implicitly through identifiers, is difficult to reconstruct. To address the first challenge, we consider Constrained Horn Clauses (CHCs) as a standard representation of program verification problems, providing a simple, programming-language-independent syntax. To address the second, we explore graph representations of CHCs and propose a new Relational Hypergraph Neural Network (R-HyGNN) architecture to learn program features. We introduce two graph representations of CHCs. The first, the constraint graph (CG), emphasizes syntactic information: it translates the symbols of CHCs and their relations into typed nodes and binary edges, respectively, and represents the constraints as abstract syntax trees. The second, the control- and data-flow hypergraph (CDHG), emphasizes semantic information by representing control and data flow through ternary hyperedges. R-HyGNN extends Relational Graph Convolutional Networks to handle hypergraphs. To evaluate the ability of R-HyGNN to extract semantic information from programs, we train R-HyGNN models on both graph representations and on five proxy tasks of increasing difficulty, using benchmarks from CHC-COMP 2021 as training data. The most difficult proxy task requires the model to predict the occurrence of clauses in counter-examples, which subsumes satisfiability of CHCs. CDHG achieves 90.59% accuracy in this task. Furthermore, R-HyGNN makes perfect predictions on one of the graphs consisting of more than 290 clauses. Overall, our experiments indicate that R-HyGNN can capture intricate program features for guiding verification problems.
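To make the idea of message passing over typed hyperedges more concrete, the following is a minimal sketch of one update round, not the authors' implementation. It assumes that each hyperedge is an ordered tuple of nodes with an edge type, that every (edge type, position) pair has its own weight matrix applied to the concatenated features of all nodes in the hyperedge, and that incoming messages are summed and passed through a ReLU. The function name `hypergraph_step`, the edge type `"control_flow"`, the node names, and all numeric values are hypothetical and chosen only for illustration.

```python
def relu(vec):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in vec]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def hypergraph_step(features, hyperedges, weights):
    """One message-passing round over a typed hypergraph (illustrative sketch).

    features:   dict node -> feature vector of dimension d
    hyperedges: list of (edge_type, tuple_of_nodes); node order matters
    weights:    dict (edge_type, position) -> matrix of shape d x (k * d),
                where k is the number of nodes in hyperedges of that type
    """
    dim = len(next(iter(features.values())))
    incoming = {node: [0.0] * dim for node in features}
    for edge_type, nodes in hyperedges:
        # Concatenate the features of every node in the hyperedge.
        concatenated = [x for node in nodes for x in features[node]]
        # Each position in the hyperedge receives its own message.
        for position, node in enumerate(nodes):
            message = matvec(weights[(edge_type, position)], concatenated)
            incoming[node] = [a + b for a, b in zip(incoming[node], message)]
    # Sum aggregation followed by a non-linearity.
    return {node: relu(vec) for node, vec in incoming.items()}


# Hypothetical toy input: one ternary hyperedge with 1-dimensional features.
features = {"head": [1.0], "body": [2.0], "guard": [3.0]}
hyperedges = [("control_flow", ("head", "body", "guard"))]
weights = {
    ("control_flow", 0): [[1.0, 1.0, 1.0]],
    ("control_flow", 1): [[1.0, 0.0, -1.0]],
    ("control_flow", 2): [[0.0, 0.5, 0.0]],
}
updated = hypergraph_step(features, hyperedges, weights)
print(updated)
```

In this toy input, the node at position 1 receives a negative message and is clipped to zero by the ReLU, while the other two nodes keep positive values. A trained model would learn the weight matrices and stack several such rounds; those details are outside this sketch.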




