Exploring Unsupervised Learning Methods for Automated Protocol Analysis

11/17/2021
by Arijit Dasgupta, et al.

The ability to analyse and differentiate network protocol traffic is crucial for network resource management, allowing Telcos to provide differentiated services. Automated Protocol Analysis (APA) can significantly improve efficiency and reduce reliance on human experts. There are numerous state-of-the-art unsupervised methods for clustering unknown protocols in APA. However, many such methods have not been sufficiently explored using diverse test datasets and thus fail to demonstrate their robustness to generalise. This study proposed a comprehensive framework to evaluate various combinations of feature extraction and clustering methods in APA. It also proposed a novel approach to automate the selection of dataset-dependent model parameters for feature extraction, resulting in improved performance. Promising results from a novel field-based tokenisation approach also led to our proposal of a novel automated hybrid approach for feature extraction and clustering of unknown protocols in APA. Our proposed hybrid approach performed best on 7 of the 9 diverse test datasets, thus displaying the robustness to generalise across diverse unknown protocols. It also outperformed the unsupervised clustering technique in the state-of-the-art open-source APA tool NETZOB on all test datasets.

