Host-based anomaly detection using Eigentraces feature extraction and one-class classification on system call trace data
This paper proposes a methodology for host-based anomaly detection that combines a semi-supervised one-class classifier with Eigentraces, a PCA-based feature extraction technique, applied to system call trace data. The one-class classifier generates a set of artificial data from a reference distribution and then, through the Bayes formulation, combines the target class probability function with the artificial class density function to estimate the target class density function. The benchmark ADFA-LD dataset is used for the simulation study; it contains thousands of system call traces collected during various normal and attack processes in a Linux operating system environment. For pre-processing and feature extraction, the system call traces are windowed and then subjected to principal component analysis, a procedure named Eigentraces. The target class probability function is modeled separately by a Radial Basis Function neural network and by a Random Forest learner so that their performance can be compared. The simulation study showed that the proposed intrusion detection system achieves high performance in detecting both anomalies and normal activity with respect to a set of well-accepted metrics, including detection rate, accuracy, and missed and false alarm rates.
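The two technical steps named in the abstract, windowing followed by PCA and the Bayes combination of a class probability with an artificial class density, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of overlapping windows, the window width, the number of components, and the toy trace are all assumptions made for illustration, and the density formula is the standard Bayes rearrangement P(x|T) = ((1 - P(T)) / P(T)) * (P(T|x) / (1 - P(T|x))) * P(x|A), which the abstract describes only in words.

```python
import numpy as np

def sliding_windows(trace, width):
    """Cut a system call trace into overlapping windows of `width` call IDs.
    (Overlapping windows with stride 1 are an assumption of this sketch.)"""
    return np.array(
        [trace[i:i + width] for i in range(len(trace) - width + 1)],
        dtype=float,
    )

def fit_pca(windows, n_components):
    """Fit PCA via SVD of the centered windows.
    Returns the window mean and the top principal directions (as rows)."""
    mean = windows.mean(axis=0)
    _, _, vt = np.linalg.svd(windows - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(windows, mean, components):
    """Project windows onto the principal directions to get feature vectors."""
    return (windows - mean) @ components.T

def target_density(p_target_given_x, prior_target, artificial_density):
    """Estimate the target class density P(x|T) from a classifier's P(T|x),
    the target prior P(T), and the known reference density P(x|A)."""
    # Clip to avoid division by zero when the classifier outputs exactly 1.
    p = np.clip(p_target_given_x, 1e-9, 1.0 - 1e-9)
    return (1.0 - prior_target) / prior_target * (p / (1.0 - p)) * artificial_density

# Toy trace of hypothetical system call IDs (not taken from ADFA-LD).
toy_trace = [3, 4, 3, 5, 6, 3, 4, 3, 5, 6]
toy_windows = sliding_windows(toy_trace, width=4)
toy_mean, toy_components = fit_pca(toy_windows, n_components=2)
toy_features = project(toy_windows, toy_mean, toy_components)

# With equal priors and an undecided classifier, the target density
# estimate equals the reference density.
toy_density = target_density(0.5, prior_target=0.5, artificial_density=0.2)
```

In a full pipeline, `p_target_given_x` would come from a two-class learner (the abstract names an RBF network and a Random Forest) trained to separate projected normal windows from artificial points drawn from the reference distribution, and a window would be flagged as anomalous when its estimated target density falls below a chosen threshold.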