Selective Inference Approach for Statistically Sound Predictive Pattern Mining

by   Shinya Suzumura, et al.
Nagoya Institute of Technology
Osaka University
The University of Tokyo

Discovering statistically significant patterns from databases is an important challenging problem. The main obstacle of this problem is in the difficulty of taking into account the selection bias, i.e., the bias arising from the fact that patterns are selected from extremely large number of candidates in databases. In this paper, we introduce a new approach for predictive pattern mining problems that can address the selection bias issue. Our approach is built on a recently popularized statistical inference framework called selective inference. In selective inference, statistical inferences (such as statistical hypothesis testing) are conducted based on sampling distributions conditional on a selection event. If the selection event is characterized in a tractable way, statistical inferences can be made without minding selection bias issue. However, in pattern mining problems, it is difficult to characterize the entire selection process of mining algorithms. Our main contribution in this paper is to solve this challenging problem for a class of predictive pattern mining problems by introducing a novel algorithmic framework. We demonstrate that our approach is useful for finding statistically significant patterns from databases.


page 3

page 8


Bounded P-values in Parametric Programming-based Selective Inference

Selective inference (SI) has been actively studied as a promising framew...

An Efficient Post-Selection Inference on High-Order Interaction Models

Finding statistically significant high-order interaction features in pre...

Searching for significant patterns in stratified data

Significant pattern mining, the problem of finding itemsets that are sig...

Statistically Significant Pattern Mining with Ordinal Utility

Statistically significant patterns mining (SSPM) is an essential and cha...

Safe Pattern Pruning: An Efficient Approach for Predictive Pattern Mining

In this paper we study predictive pattern mining problems where the goal...

Fast and More Powerful Selective Inference for Sparse High-order Interaction Model

Automated high-stake decision-making such as medical diagnosis requires ...

A primer on statistically validated networks

In this contribution we discuss some approaches of network analysis prov...

Please sign up or login with your details

Forgot password? Click here to reset