Topological Data Analysis for Anomaly Detection in Host-Based Logs
Topological Data Analysis (TDA) gives practioners the ability to analyse the global structure of cybersecurity data. We use TDA for anomaly detection in host-based logs collected with the open-source Logging Made Easy (LME) project. We present an approach that builds a filtration of simplicial complexes directly from Windows logs, enabling analysis of their intrinsic structure using topological tools. We compare the efficacy of persistent homology and the spectrum of graph and hypergraph Laplacians as feature vectors against a standard log embedding that counts events, and find that topological and spectral embeddings of computer logs contain discriminative information for classifying anomalous logs that is complementary to standard embeddings. We end by discussing the potential for our methods to be used as part of an explainable framework for anomaly detection.
READ FULL TEXT