To believe or not to believe: Validating explanation fidelity for dynamic malware analysis

by Li Chen, et al.

Converting malware into images and then applying vision-based deep learning algorithms has shown superior threat detection efficacy compared with classical machine learning algorithms. When malware is visualized as images, vision-based interpretation schemes can also be applied to extract insights into why individual samples are classified as malicious. In this work, via two case studies of dynamic malware classification, we extend the local interpretable model-agnostic explanation (LIME) algorithm to explain image-based dynamic malware classification and examine its interpretation fidelity. For both case studies, we first train deep learning models via transfer learning on malware images and demonstrate high classification effectiveness. We then apply an explanation method to the images and correlate the results back to the samples to validate whether the algorithmic insights are consistent with security domain expertise. In our first case study, the interpretation framework identifies indirect calls that uniquely characterize the underlying exploit behavior of a malware family. In our second case study, the interpretation framework extracts insightful information, such as cryptography-related APIs, when applied to images created from API existence, but generates ambiguous interpretations on images created from API sequences and frequencies. Our findings indicate that current image-based interpretation techniques are promising for explaining vision-based malware classification. We continue to develop image-based interpretation schemes specifically for security applications.
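To make the API-existence case concrete, the following is a minimal sketch of a LIME-style explanation over a tiny API-existence image. It is not the authors' implementation: the helper names (`api_existence_image`, `lime_style_explain`, `toy_predict`), the four-API vocabulary, the pixel-level masking, and the kernel width are all assumptions chosen for illustration, and `toy_predict` is a hand-written stand-in for a trained deep learning model.

```python
import numpy as np

# Illustrative vocabulary only; the paper's actual API set is not given here.
APIS = ["CreateFileW", "CryptEncrypt", "RegSetValueExW", "Sleep"]


def api_existence_image(present, side):
    """Lay a 0/1 API-existence vector out as a side x side grid (row-major)."""
    img = np.zeros(side * side, dtype=float)
    img[:len(present)] = present
    return img.reshape(side, side)


def lime_style_explain(image, predict, n_samples=500, kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear surrogate; return one weight per pixel."""
    rng = np.random.default_rng(seed)
    flat = image.ravel()
    # Each mask row keeps (1) or blanks (0) every pixel of the original image.
    masks = rng.integers(0, 2, size=(n_samples, flat.size)).astype(float)
    masks[0] = 1.0  # first sample is the unperturbed image
    scores = np.array([predict((flat * m).reshape(image.shape)) for m in masks])
    # Samples that blank fewer pixels are closer to the original and count more.
    dist = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Weighted least squares with an intercept column.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], scores * sw, rcond=None)
    return coef[:-1].reshape(image.shape)


def toy_predict(img):
    """Hypothetical black box: reacts mostly to the CryptEncrypt pixel."""
    return 0.9 * img[0, 1] + 0.1 * img[0, 0]


image = api_existence_image([1, 1, 0, 1], side=2)
pixel_weights = lime_style_explain(image, toy_predict)
top_api = APIS[int(np.argmax(pixel_weights))]
print(top_api)
```

Because each pixel in an API-existence image corresponds to exactly one API, the highest-weighted pixel maps directly back to an API name that an analyst can check. Images built from API sequences or frequencies lack that one-to-one mapping, which is one plausible reading of why the abstract reports ambiguous interpretations for them.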


