Topology-Aware Hashing for Effective Control Flow Graph Similarity Analysis

04/14/2020
by Yuping Li, et al.

Control Flow Graph (CFG) similarity analysis is an essential technique for a variety of security analysis tasks, including malware detection and malware clustering. Although various algorithms have been developed, existing CFG similarity analysis methods still suffer from limited efficiency, accuracy, and usability. In this paper, we propose a novel fuzzy hashing scheme called topology-aware hashing (TAH) for effective and efficient CFG similarity analysis. Given the CFGs constructed from program binaries, we extract blended n-gram graphical features of the CFGs, encode the graphical features into numeric vectors (called graph signatures), and then measure graph similarity by comparing the graph signatures. We further employ a fuzzy hashing technique to convert the numeric graph signatures into smaller fixed-size fuzzy hash signatures for efficient similarity calculation. Our comprehensive evaluation demonstrates that TAH is more effective and efficient than existing CFG comparison techniques. To demonstrate the applicability of TAH to real-world security analysis tasks, we develop a binary similarity analysis tool based on TAH and show that it outperforms existing similarity analysis tools on malware clustering.
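
The pipeline described above (extract n-gram features from a CFG, encode them as a numeric signature, compare signatures) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the blended n-gram features are defined or how the fuzzy hash is built, so the sketch assumes a hypothetical feature (sequences of node out-degrees along walks of 1 and 2 nodes), a sparse count vector as the signature, and cosine similarity as the comparison. The fuzzy hashing step is omitted. All function names and example graphs are made up for illustration.

```python
import math
from collections import Counter


def ngram_paths(cfg, n):
    """Yield every walk of n nodes that follows edges in the CFG."""
    def extend(path):
        if len(path) == n:
            yield tuple(path)
            return
        for succ in cfg.get(path[-1], ()):
            yield from extend(path + [succ])

    for node in cfg:
        yield from extend([node])


def node_label(cfg, node):
    """Assumed topology label: the out-degree of the basic block."""
    return len(cfg.get(node, ()))


def graph_signature(cfg, max_n=2):
    """Count label n-grams for n = 1..max_n (a sparse numeric vector)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for path in ngram_paths(cfg, n):
            counts["-".join(str(node_label(cfg, v)) for v in path)] += 1
    return counts


def cosine(sig_a, sig_b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * sig_b[key] for key, count in sig_a.items())
    norm_a = math.sqrt(sum(c * c for c in sig_a.values()))
    norm_b = math.sqrt(sum(c * c for c in sig_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical CFGs as adjacency lists: two diamonds with different
# block names, and one straight-line chain of four blocks.
diamond_a = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
diamond_b = {"n0": ["n1", "n2"], "n1": ["n3"], "n2": ["n3"], "n3": []}
chain = {"p": ["q"], "q": ["r"], "r": ["s"], "s": []}

same_shape = cosine(graph_signature(diamond_a), graph_signature(diamond_b))
diff_shape = cosine(graph_signature(diamond_a), graph_signature(chain))
```

Because the labels depend only on topology, the two diamonds receive identical signatures regardless of block names, while the chain scores lower against either of them. A real system would need richer features and a compact fixed-size hash of the signature, as the abstract describes.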

research
08/24/2022

Transformer-Boosted Anomaly Detection with Fuzzy Hashes

Fuzzy hashes are an important tool in digital forensics and are used in ...
research
11/27/2021

Assessing the Effectiveness of YARA Rules for Signature-Based Malware Detection and Classification

Malware often uses obfuscation techniques or is modified slightly to eva...
research
12/17/2018

Fuzzy Hashing as Perturbation-Consistent Adversarial Kernel Embedding

Measuring the similarity of two files is an important task in malware an...
research
12/16/2021

Revisiting Fuzzy Signatures: Towards a More Risk-Free Cryptographic Authentication System based on Biometrics

Biometric authentication is one of the promising alternatives to standar...
research
11/25/2018

Poisoning Behavioral Malware Clustering

Clustering algorithms have become a popular tool in computer security to...
research
02/12/2018

BagMinHash - Minwise Hashing Algorithm for Weighted Sets

Minwise hashing has become a standard tool to calculate signatures which...
research
09/04/2017

Lattice Operations on Terms over Similar Signatures

Unification and generalization are operations on two terms computing res...
