Imbalanced Malware Images Classification: a CNN based Approach

08/27/2017
by Songqing Yue, et al.

Deep convolutional neural networks (CNNs) can be applied to malware binary detection by treating it as an image classification problem. Performance, however, degrades when malware families (classes) are imbalanced. To mitigate this issue, we propose a simple yet effective weighted softmax loss that can be used as the final layer of a deep CNN. The original softmax loss is multiplied by a per-class weight whose value is determined by class size, and a scaling parameter is included in computing the weight. We study how to select this parameter and give an empirical recommendation. The weighted loss aims to alleviate the impact of data imbalance in an end-to-end learning fashion. To validate its efficacy, we deploy the proposed weighted loss in a pre-trained deep CNN model and fine-tune it, achieving promising results on malware image classification. Extensive experiments also indicate that the new loss function can be fitted to other typical CNNs and improves their classification performance.
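The abstract does not state the weighting formula, so the following is only a minimal NumPy sketch of the general idea of a class-size-weighted softmax loss, not the authors' method. The function names `class_weights` and `weighted_softmax_loss`, the parameter name `beta`, and the specific weight rule `1 + beta * (N_max - N_c) / N_max` are all assumptions chosen for illustration.

```python
import numpy as np


def class_weights(class_counts, beta=1.0):
    """Hypothetical weight rule (an assumption, not taken from the paper).

    The largest class gets weight 1; smaller classes get weights that grow
    with their size deficit, scaled by beta. beta = 0 disables weighting.
    """
    counts = np.asarray(class_counts, dtype=float)
    return 1.0 + beta * (counts.max() - counts) / counts.max()


def weighted_softmax_loss(logits, labels, weights):
    """Mean softmax cross-entropy, each sample scaled by its class weight."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    weights = np.asarray(weights, dtype=float)

    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Negative log-likelihood of the true class for every sample.
    nll = -log_probs[np.arange(len(labels)), labels]

    # Samples from rare classes contribute more to the loss.
    return float(np.mean(weights[labels] * nll))


if __name__ == "__main__":
    counts = [100, 50, 10]            # illustrative class sizes, not real data
    w = class_weights(counts, beta=1.0)
    logits = np.array([[2.0, 0.5, 0.1],
                       [0.2, 0.3, 1.5]])
    labels = np.array([0, 2])
    print(w, weighted_softmax_loss(logits, labels, w))
```

In a deep learning framework the same effect is usually obtained by passing a per-class weight vector to the built-in cross-entropy loss; note that such built-ins may normalize by the sum of weights rather than the batch size, which differs from the plain mean used in this sketch.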
