Killing Two Birds with One Stone: Malicious Domain Detection with High Accuracy and Coverage

by Issa Khalil, et al.

Inference-based techniques are one of the major approaches to analyzing DNS data and detecting malicious domains. The key idea of inference techniques is to first define associations between domains based on features extracted from DNS data. Then, an inference algorithm is deployed to infer potentially malicious domains based on their direct or indirect associations with known malicious ones. The way associations are defined is key to the effectiveness of an inference technique. An association scheme should be both accurate (i.e., avoid falsely associating domains with no meaningful connections) and have good coverage (i.e., identify all associations between domains with meaningful connections). Due to the limited scope of information provided by DNS data, it is challenging to design an association scheme that achieves both high accuracy and good coverage. In this paper, we propose a new association scheme to identify domains controlled by the same entity. Our key idea is an in-depth analysis of active DNS data to accurately separate public IPs from dedicated ones, which enables us to build high-quality associations between domains. Our scheme identifies many meaningful connections between domains that are discarded by existing state-of-the-art approaches. Our experimental results show that the proposed association scheme not only significantly improves domain coverage compared to existing approaches but also achieves better detection accuracy. The existing path-based inference algorithm is specifically designed for DNS data analysis; it is effective but computationally expensive. As a solution, we investigate the effectiveness of combining our association scheme with the generic belief propagation algorithm. Through comprehensive experiments, we show that this approach offers significant efficiency and scalability improvements with only a minor negative impact on detection accuracy.
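To make the general idea concrete, here is a minimal Python sketch of association-based inference. It is not the authors' scheme, and it is neither the path-based algorithm nor belief propagation. It assumes a hypothetical heuristic in which an IP hosting at least `public_ip_threshold` domains is treated as public and ignored, and it uses a simple damped score propagation from known malicious seeds. All function names, parameters, domains, and addresses are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def build_associations(resolutions, public_ip_threshold=3):
    """Associate domains that share an IP assumed to be dedicated.

    resolutions: iterable of (domain, ip) pairs.
    An IP hosting public_ip_threshold or more domains is treated as
    public (hypothetical heuristic, not the paper's classifier).
    """
    domains_by_ip = defaultdict(set)
    for domain, ip in resolutions:
        domains_by_ip[ip].add(domain)

    graph = defaultdict(set)
    for domains in domains_by_ip.values():
        if len(domains) >= public_ip_threshold:
            continue  # skip IPs that look like shared hosting
        for a, b in combinations(sorted(domains), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def propagate_scores(graph, seeds, damping=0.5, rounds=5):
    """Spread a maliciousness score outward from known malicious seeds.

    Each non-seed domain takes the damped maximum score of its
    neighbours; this is a stand-in for a real inference algorithm.
    """
    scores = {domain: 0.0 for domain in graph}
    for seed in seeds:
        scores[seed] = 1.0

    for _ in range(rounds):
        updated = {}
        for domain, neighbours in graph.items():
            if domain in seeds:
                updated[domain] = 1.0
            else:
                best = max((scores[n] for n in neighbours), default=0.0)
                updated[domain] = damping * best
        for seed in seeds:
            updated.setdefault(seed, 1.0)
        scores = updated
    return scores


# Illustrative data only: 10.0.0.9 hosts four domains, so it is skipped.
resolutions = [
    ("bad.example", "10.0.0.1"),
    ("sibling.example", "10.0.0.1"),
    ("sibling.example", "10.0.0.2"),
    ("cousin.example", "10.0.0.2"),
    ("a.example", "10.0.0.9"),
    ("b.example", "10.0.0.9"),
    ("c.example", "10.0.0.9"),
    ("bad.example", "10.0.0.9"),
]
graph = build_associations(resolutions)
scores = propagate_scores(graph, seeds={"bad.example"})
```

In this toy data, `sibling.example` is directly associated with the seed and `cousin.example` only indirectly, so the latter ends up with a lower score. Domains that appear only on the IP treated as public receive no association at all, which shows the accuracy side of the trade-off described in the abstract.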




