NeuDep: Neural Binary Memory Dependence Analysis

10/04/2022
by   Kexin Pei, et al.
0

Determining whether multiple instructions can access the same memory location is a critical task in binary analysis. It is challenging as statically computing precise alias information is undecidable in theory. The problem aggravates at the binary level due to the presence of compiler optimizations and the absence of symbols and types. Existing approaches either produce significant spurious dependencies due to conservative analysis or scale poorly to complex binaries. We present a new machine-learning-based approach to predict memory dependencies by exploiting the model's learned knowledge about how binary programs execute. Our approach features (i) a self-supervised procedure that pretrains a neural net to reason over binary code and its dynamic value flows through memory addresses, followed by (ii) supervised finetuning to infer the memory dependencies statically. To facilitate efficient learning, we develop dedicated neural architectures to encode the heterogeneous inputs (i.e., code, data values, and memory addresses from traces) with specific modules and fuse them with a composition learning strategy. We implement our approach in NeuDep and evaluate it on 41 popular software projects compiled by 2 compilers, 4 optimizations, and 4 obfuscation passes. We demonstrate that NeuDep is more precise (1.5x) and faster (3.5x) than the current state-of-the-art. Extensive probing studies on security-critical reverse engineering tasks suggest that NeuDep understands memory access patterns, learns function signatures, and is able to match indirect calls. All these tasks either assist or benefit from inferring memory dependencies. Notably, NeuDep also outperforms the current state-of-the-art on these tasks.

READ FULL TEXT
research
10/02/2020

XDA: Accurate, Robust Disassembly with Transfer Learning

Accurate and robust disassembly of stripped binaries is challenging. The...
research
12/16/2020

Trex: Learning Execution Semantics from Micro-Traces for Binary Similarity

Detecting semantically similar functions – a crucial analysis capability...
research
06/10/2021

Semantic-aware Binary Code Representation with BERT

A wide range of binary analysis applications, such as bug discovery, mal...
research
10/31/2022

Unsafe's Betrayal: Abusing Unsafe Rust in Binary Reverse Engineering via Machine Learning

Memory-safety bugs introduce critical software-security issues. Rust pro...
research
09/14/2022

Cornucopia: A Framework for Feedback Guided Generation of Binaries

Binary analysis is an important capability required for many security an...
research
01/06/2023

CFG2VEC: Hierarchical Graph Neural Network for Cross-Architectural Software Reverse Engineering

Mission-critical embedded software is critical to our society's infrastr...
research
12/18/2019

Binsec/Rel: Efficient Relational Symbolic Execution for Constant-Time at Binary-Level

The constant-time programming discipline (CT) is an efficient countermea...

Please sign up or login with your details

Forgot password? Click here to reset