Efficient Constrained Pattern Mining Using Dynamic Item Ordering for Explainable Classification
Learning of interpretable classification models has been attracting much attention for the last few years. Discovery of succinct and contrasting patterns that can highlight the differences between the two classes is very important. Such patterns are useful for human experts, and can be used to construct powerful classifiers. In this paper, we consider mining of minimal emerging patterns from high-dimensional data sets under a variety of constraints in a supervised setting. We focus on an extension in which patterns can contain negative items that designate the absence of an item. In such a case, a database becomes highly dense, and it makes mining more challenging since popular pattern mining techniques such as fp-tree and occurrence deliver do not efficiently work. To cope with this difficulty, we present an efficient algorithm for mining minimal emerging patterns by combining two techniques: dynamic variable-ordering during pattern search for enhancing pruning effect, and the use of a pointer-based dynamic data structure, called dancing links, for efficiently maintaining occurrence lists. Experiments on benchmark data sets showed that our algorithm achieves significant speed-ups over emerging pattern mining approach based on LCM, a very fast depth-first frequent itemset miner using static variable-ordering.
READ FULL TEXT