Path-Based Function Embedding and its Application to Specification Mining

02/21/2018
by   Daniel Defreez, et al.

Relationships among program elements are useful for program understanding, debugging, and analysis. One such relationship is function synonymy. Function synonyms are functions that play a similar role in code; examples include functions that perform initialization for different device drivers, and functions that implement different symmetric-key encryption schemes. Function synonyms are not necessarily semantically equivalent and can be syntactically dissimilar; consequently, approaches for identifying code clones or functional equivalence cannot be used to identify them. This paper presents func2vec, an algorithm that maps each function to a vector in a vector space such that function synonyms are grouped together. We compute the function embedding by training a neural network on sentences generated by random walks over the interprocedural control-flow graph. We show the effectiveness of func2vec in identifying function synonyms in the Linux kernel. Furthermore, we show how knowing function synonyms enables mining error-handling specifications with high support in Linux file systems and drivers.
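
The random-walk step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified graph whose nodes are function names and whose edges stand in for control flow between them, and the names `random_walks`, `toy_graph`, and the toy function names are all hypothetical. Each walk yields one "sentence" of function names; training an embedding model on those sentences is not shown.

```python
import random


def random_walks(graph, walks_per_node, walk_length, seed=0):
    """Generate sentences (lists of node names) by random walks over a graph.

    graph: dict mapping a node name to a list of successor node names.
    This is a simplified stand-in for an interprocedural control-flow graph.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sentences = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                successors = graph.get(walk[-1], [])
                if not successors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(successors))
            sentences.append(walk)
    return sentences


# Hypothetical toy graph: two driver init functions share the same context.
toy_graph = {
    "driver_a_init": ["alloc_buffer"],
    "driver_b_init": ["alloc_buffer"],
    "alloc_buffer": ["check_error"],
    "check_error": [],
}

sentences = random_walks(toy_graph, walks_per_node=2, walk_length=4)
for sentence in sentences:
    print(" ".join(sentence))
```

In this toy graph, `driver_a_init` and `driver_b_init` appear in sentences with identical surrounding words, which is the kind of shared context a word-embedding model would be expected to exploit when placing function synonyms near each other.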
