Mining Compressed Repetitive Gapped Sequential Patterns Efficiently

06/04/2009
by Yongxin Tong, et al.

Mining frequent sequential patterns from sequence databases has been a central research topic in data mining, and various efficient algorithms for mining sequential patterns have been proposed and studied. Recently, in many problem domains (e.g., program execution traces), a new line of research called mining repetitive gapped sequential patterns has attracted the attention of many researchers. It considers not only the repetition of a sequential pattern across different sequences but also its repetition within a sequence, which is more meaningful than general sequential pattern mining, which captures only occurrences in different sequences. However, the number of repetitive gapped sequential patterns generated even by closed-pattern mining algorithms may be too large for users to understand, especially when the support threshold is low. In this paper, we propose and study the problem of compressing repetitive gapped sequential patterns. Inspired by ideas from summarizing frequent itemsets, in particular RPglobal, we develop an algorithm, CRGSgrow (Compressing Repetitive Gapped Sequential pattern grow), which includes an efficient pruning strategy, SyncScan, and an efficient representative pattern checking scheme, δ-dominate sequential pattern checking. CRGSgrow is a two-step approach: in the first step, we obtain all closed repetitive sequential patterns as the candidate set of representative repetitive sequential patterns and, at the same time, obtain most of the representative repetitive sequential patterns; in the second step, we spend only a little time finding the remaining representative patterns from the candidate set. An empirical study with both real and synthetic data sets clearly shows that CRGSgrow has good performance.
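
To make the contrast between the two notions of support concrete, the following is a minimal Python sketch, not the paper's method. It assumes sequences are strings of single-character events, and it uses a deliberately simplified repetition count (a single left-to-right scan that restarts the pattern after each complete match, so interleaved occurrences are not counted). The abstract does not give the formal support definition used by CRGSgrow, so the function names and the counting rule here are illustrative assumptions only.

```python
from typing import Sequence


def contains(seq: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if `pattern` occurs in `seq` as a subsequence, gaps allowed."""
    it = iter(seq)
    # `item in it` advances the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def sequence_support(db: Sequence[Sequence[str]], pattern: Sequence[str]) -> int:
    """Classical support: number of sequences containing the pattern at least once."""
    return sum(1 for seq in db if contains(seq, pattern))


def greedy_repetitions(seq: Sequence[str], pattern: Sequence[str]) -> int:
    """Simplified within-sequence count (an assumption, not the paper's definition).

    Scans once from left to right and restarts the pattern after every
    complete match, so only occurrences that do not interleave are counted.
    """
    if not pattern:
        return 0
    count = 0
    j = 0  # index of the next pattern event to match
    for item in seq:
        if item == pattern[j]:
            j += 1
            if j == len(pattern):
                count += 1
                j = 0
    return count


def repetitive_support(db: Sequence[Sequence[str]], pattern: Sequence[str]) -> int:
    """Sum of the simplified within-sequence counts over the whole database."""
    return sum(greedy_repetitions(seq, pattern) for seq in db)


db = ["ABCABC", "ABAB", "CCC"]
print(sequence_support(db, "AB"))    # counts sequences containing A..B
print(repetitive_support(db, "AB"))  # also counts repeats inside a sequence
```

On this toy database the pattern `AB` appears in two sequences but is matched four times by the simplified scan, which is the kind of within-sequence repetition that classical support ignores. A faithful implementation would need the paper's formal definition of repetitive gapped support.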
