Boosting Frequent Itemset Mining via Early Stopping Intersections

by Huu Hiep Nguyen, et al.
Duy Tan University

Mining frequent itemsets from a transaction database is a fundamental problem in data mining and has established itself as a building block for many pattern mining tasks. In this paper, we present a general technique to reduce support checking time in existing depth-first search generate-and-test schemes such as Eclat/dEclat and PrePost+. Our technique allows infrequent candidate itemsets to be detected early. It is based on an early-stopping criterion and is general enough to be applicable to many frequent itemset mining algorithms. We have applied the technique to two TID-list based schemes (Eclat/dEclat) and one N-list based scheme (PrePost+). We have tested the technique on a variety of datasets, and the results confirm its effectiveness in reducing runtime.
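To make the idea of early-stopping support checking concrete, the following is a minimal Python sketch for an Eclat-style intersection of two sorted TID lists. The abstract does not state the paper's exact criterion, so the bound used here (matches found so far plus the shorter remaining tail) is an assumption, and the function name `intersect_early_stop` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence


def intersect_early_stop(
    a: Sequence[int], b: Sequence[int], minsup: int
) -> Optional[List[int]]:
    """Intersect two sorted TID lists.

    Returns the intersection if its size is at least minsup, otherwise None.
    The scan aborts as soon as the candidate can no longer reach minsup.
    NOTE: this stopping bound is an illustrative assumption, not necessarily
    the criterion used in the paper.
    """
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        # Upper bound on the final size: matches so far plus the shorter
        # of the two unscanned tails.
        if len(out) + min(len(a) - i, len(b) - j) < minsup:
            return None  # candidate itemset is provably infrequent
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out if len(out) >= minsup else None


if __name__ == "__main__":
    tids_x = [1, 2, 3, 5, 8]
    tids_y = [2, 3, 4, 8, 9]
    print(intersect_early_stop(tids_x, tids_y, 3))
    print(intersect_early_stop(tids_x, tids_y, 4))
```

With a minimum support of 4, the second call gives up before reaching the end of either list, because the bound drops below the threshold; this is the kind of saving an early-stopping criterion targets when most candidates are infrequent.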




Abstract Representations and Frequent Pattern Discovery

We discuss the frequent pattern mining problem in a general setting. Fro...

Mining All Non-Derivable Frequent Itemsets

Recent studies on frequent itemset mining algorithms resulted in signifi...

Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms

Frequent itemset mining is a popular data mining technique. Apriori, Ecl...

Analyzing Large-Scale, Distributed and Uncertain Data

The exponential growth of data in current times and the demand to gain i...

Characterizing Transactional Databases for Frequent Itemset Mining

This paper presents a study of the characteristics of transactional data...

cgSpan: Closed Graph-Based Substructure Pattern Mining

gSpan is a popular algorithm for mining frequent subgraphs. cgSpan (clos...

Approximate Network Motif Mining Via Graph Learning

Frequent and structurally related subgraphs, also known as network motif...
