Logram: Efficient Log Parsing Using n-Gram Dictionaries

01/07/2020
by Hetong Dai, et al.

Software systems usually record important runtime information in their logs. Logs help practitioners understand system runtime behaviors and diagnose field failures. As logs are usually very large, automated log analysis is needed to assist practitioners in their software operation and maintenance efforts. Typically, the first step of automated log analysis is log parsing, i.e., converting unstructured raw logs into structured data. However, log parsing is challenging because logs are produced by static templates in the source code (i.e., logging statements), yet these templates are usually inaccessible when parsing logs. Prior work proposed automated log parsing approaches that have achieved high accuracy. However, as the volume of logs grows rapidly in the era of cloud computing, efficiency becomes a major concern in log parsing. In this work, we propose an automated log parsing approach, Logram, which leverages n-gram dictionaries to achieve efficient log parsing. We evaluated Logram on 16 public log datasets and compared it with five state-of-the-art log parsing approaches. We found that Logram achieves a parsing accuracy similar to that of the best existing approaches while outperforming them in efficiency (i.e., it is 1.8 to 5.1 times faster than the second-fastest approaches). Furthermore, we deployed Logram on Spark and found that it scales out efficiently with the number of Spark nodes (e.g., with near-linear scalability) without sacrificing parsing accuracy. In addition, we demonstrated that Logram can support effective online parsing of logs, achieving parsing results and efficiency similar to those of the offline mode.
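
To make the idea of an n-gram dictionary concrete, the following is a minimal sketch and not the Logram implementation, whose details the abstract does not give. It assumes that tokens appearing only in rare n-grams are dynamic variables, while tokens covered by at least one frequent n-gram belong to the static template. The function names (`ngrams`, `build_dictionary`, `parse`), the whitespace tokenization, the choice of 2-grams, the threshold of 2, and the sample log lines are all illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-token tuples of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_dictionary(lines, n=2):
    """Count how often each n-gram occurs across all log lines."""
    counts = Counter()
    for line in lines:
        counts.update(ngrams(line.split(), n))
    return counts


def parse(line, counts, n=2, threshold=2):
    """Replace tokens covered only by rare n-grams with '<*>'.

    Assumption: a token is static if at least one n-gram that
    contains it occurs `threshold` times or more in the dictionary.
    """
    tokens = line.split()
    grams = ngrams(tokens, n)
    if not grams:
        # Line is shorter than n tokens; leave it unchanged.
        return line
    static = [False] * len(tokens)
    for i, gram in enumerate(grams):
        if counts[gram] >= threshold:
            for j in range(i, i + n):
                static[j] = True
    return " ".join(t if s else "<*>" for t, s in zip(tokens, static))


# Hypothetical log lines, invented for illustration only.
logs = [
    "Connected to host 10.0.0.1",
    "Connected to host 10.0.0.2",
    "Connected to host 10.0.0.3",
]
dictionary = build_dictionary(logs)
print(parse(logs[0], dictionary))  # prints "Connected to host <*>"
```

Because the dictionary is a plain table of counts, counts built from separate chunks of a log can be summed and updated incrementally, which is one plausible reason a dictionary-based design would suit the distributed and online settings the abstract mentions.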


research
11/08/2018

Tools and Benchmarks for Automated Log Parsing

Logs are imperative in the development and maintenance process of many s...
research
09/15/2020

A Survey on Automated Log Analysis for Reliability Engineering

Logs are semi-structured text generated by logging statements in softwar...
research
08/14/2023

Hue: A User-Adaptive Parser for Hybrid Logs

Log parsing, which extracts log templates from semi-structured logs and ...
research
08/10/2022

LogStamp: Automatic Online Log Parsing Based on Sequence Labelling

Logs are one of the most critical data for service management. It contai...
research
10/29/2021

AWSOM-LP: An Effective Log Parsing Technique Using Pattern Recognition and Frequency Analysis

Logs provide users with useful insights to help with a variety of develo...
research
12/29/2022

System Log Parsing: A Survey

Modern information and communication systems have become increasingly ch...
research
04/22/2023

Did We Miss Something Important? Studying and Exploring Variable-Aware Log Abstraction

Due to the sheer size of software logs, developers rely on automated tec...
