Malware Analysis with Symbolic Execution and Graph Kernel

Malware analysis techniques are divided into static and dynamic analysis. Both can be bypassed by evasion techniques such as obfuscation. In a series of works, the authors have promoted the use of symbolic execution combined with machine learning to avoid such traps. Most of those works rely on natural graph-based representations that can then be plugged into graph-based learning algorithms such as gSpan. There are two main problems with this approach. The first is the cost of computing the graph. Indeed, working with graphs requires one to compute and represent the entire state space of the file under analysis. As such a computation is too expensive, the techniques often rely on strategies that compute a representative subgraph of the behaviors. Unfortunately, efficient graph-building strategies remain little explored. The second problem is the classification itself. Graph-based machine learning algorithms rely on comparing the biggest common structures, which sidelines small but specific parts of the malware signature. In addition, this approach does not allow us to work with efficient algorithms such as support vector machines. We propose a new, efficient, open-source toolchain for machine learning-based classification. We also explore how graph-kernel techniques can be used in the process. We focus on the 1-dimensional Weisfeiler-Lehman kernel, which can capture local similarities between graphs. Our experimental results show that our approach outperforms existing ones by an impressive factor.
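
To illustrate how a 1-dimensional Weisfeiler-Lehman kernel captures local similarities between graphs, here is a minimal sketch in plain Python. It is not the authors' toolchain. The graph encoding (adjacency dictionaries), the function names `wl_features` and `wl_kernel`, and the toy call labels are all assumptions made for illustration. Each node is repeatedly relabeled with its own label plus the sorted labels of its neighbors, and the kernel value counts the labels two graphs share across all rounds.

```python
from collections import Counter


def wl_features(adj, labels, iterations=2):
    """Count Weisfeiler-Lehman labels of one graph over all refinement rounds.

    adj:    dict mapping each node to a list of its neighbors (hypothetical encoding)
    labels: dict mapping each node to an initial hashable label
    """
    current = {v: labels[v] for v in adj}
    features = Counter(current.values())  # round 0: the initial labels
    for _ in range(iterations):
        refined = {}
        for v in adj:
            # Sort neighbor labels so the result does not depend on neighbor order.
            neighborhood = tuple(sorted((current[u] for u in adj[v]), key=repr))
            refined[v] = (current[v], neighborhood)
        current = refined
        features.update(current.values())
    return features


def wl_kernel(g1, g2, iterations=2):
    """Dot product of the WL label counts of two (adj, labels) graphs."""
    f1 = wl_features(g1[0], g1[1], iterations)
    f2 = wl_features(g2[0], g2[1], iterations)
    return sum(f1[key] * f2[key] for key in f1.keys() & f2.keys())


# Toy behavior graphs: three-node paths whose labels are made-up call names.
path = {0: [1], 1: [0, 2], 2: [1]}
g_a = (path, {0: "open", 1: "read", 2: "close"})
g_c = (path, {0: "open", 1: "write", 2: "close"})

k_aa = wl_kernel(g_a, g_a)  # self-similarity
k_ac = wl_kernel(g_a, g_c)  # only the initial "open" and "close" labels match
```

Because the result is a kernel value between two graphs, a matrix of such values over a dataset could in principle be passed to a support vector machine that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Whether the authors proceed this way is not stated in the abstract.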


