ONCE and ONCE+: Counting the Frequency of Time-constrained Serial Episodes in a Streaming Sequence

01/29/2018
by   Hui Li, et al.

As a representative sequential pattern mining problem, counting the frequency of serial episodes in a streaming sequence has drawn continuous attention in academia because of its wide practical application, e.g., in telecommunication alarms, stock markets, transaction logs, and bioinformatics. Although a number of serial episode mining algorithms have been developed recently, most of them are neither stream-oriented, as they require multiple passes over the dataset, nor time-aware, as they fail to take into account the time constraint of serial episodes. In this paper, we propose two novel one-pass algorithms, ONCE and ONCE+, which respectively compute two popular frequencies of given episodes satisfying a predefined time constraint as the signals in a stream arrive one after another. ONCE is used only for the non-overlapped frequency, where the occurrences of a serial episode in the sequence do not intersect. ONCE+ is designed for the distinct frequency, where the occurrences of a serial episode do not share any event. A theoretical study proves that our algorithms correctly compute the frequency of the target time-constrained serial episodes in a given stream. An experimental study over both real-world and synthetic datasets demonstrates that the proposed algorithms work, with little time and space, in signal-intensive streams where millions of signals arrive within a single second. Moreover, the algorithms have been applied in a real stream processing system, where the efficacy and efficiency of this work are tested in practical applications.
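To make the notion of non-overlapped frequency under a time constraint concrete, the following is a minimal one-pass sketch. It is not the ONCE algorithm from the paper, whose details are not given in this abstract. It assumes that the time constraint bounds the span between the first and last event of an occurrence by `tau`, and that events arrive as `(event_type, timestamp)` pairs with strictly increasing timestamps. The function name `count_non_overlapped` and its parameters are hypothetical.

```python
from typing import Hashable, Iterable, List, Optional, Sequence, Tuple


def count_non_overlapped(
    stream: Iterable[Tuple[Hashable, float]],
    episode: Sequence[Hashable],
    tau: float,
) -> int:
    """Count non-overlapped occurrences of a serial episode in one pass.

    Assumption (not taken from the paper): an occurrence is valid when
    last_timestamp - first_timestamp <= tau.
    """
    k = len(episode)
    if k == 0:
        return 0
    # start[j] holds the latest start time of a partial occurrence that has
    # matched episode[0..j]; None means no such partial occurrence exists.
    start: List[Optional[float]] = [None] * k
    count = 0
    for event, t in stream:
        # Walk prefixes from longest to shortest so that a single event
        # never extends the same partial occurrence twice.
        for j in range(k - 1, -1, -1):
            if event != episode[j]:
                continue
            if j == 0:
                start[0] = t  # a later start is always at least as good
            else:
                prev = start[j - 1]
                if prev is not None and t - prev <= tau:
                    start[j] = prev
        if start[k - 1] is not None:
            # A full occurrence just completed; discard all partial state so
            # the next counted occurrence begins strictly after this one.
            count += 1
            start = [None] * k
    return count
```

For example, with the episode `("A", "B")`, `tau = 5`, and the stream `A@1, B@2, A@3, A@10, B@20, A@21, B@22`, the sketch counts two occurrences: `(A@1, B@2)` and `(A@21, B@22)`. The pair `(A@10, B@20)` is rejected because its span exceeds `tau`. Keeping only the latest start time per prefix keeps memory proportional to the episode length, which is one simple way to stay within a single pass.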
