An Efficient Post-Selection Inference on High-Order Interaction Models

06/26/2015
by S. Suzumura, et al.

Finding statistically significant high-order interaction features in predictive modeling is an important but challenging task. The difficulty lies in the fact that, in recent applications with high-dimensional covariates, the number of possible high-order interaction features is extremely large. Identifying statistically significant features from such a huge pool of candidates is highly challenging in both the computational and the statistical sense. To address this problem, we consider a two-stage algorithm in which we first select a set of high-order interaction features by marginal screening, and then make statistical inferences on the regression model fitted only with the selected features. Such statistical inference is called post-selection inference (PSI), and it has been receiving increasing attention in the literature. One of the seminal recent advances in the PSI literature is the work of Lee et al., who presented an algorithmic framework for computing exact sampling distributions in PSI. A main challenge in applying their approach to our high-order interaction models is that PSI in general depends not only on the selected features but also on the unselected features, which makes it hard to apply to our extremely high-dimensional high-order interaction models. The goal of this paper is to overcome this difficulty by introducing a novel, efficient method for PSI. Our key idea is to exploit the underlying tree structure among high-order interaction features, and to develop a pruning method for the tree that enables us to quickly identify a group of unselected features that are guaranteed to have no influence on PSI. The experimental results indicate that the proposed method allows us to reliably identify statistically significant high-order interaction features at reasonable computational cost.
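
To make the tree idea concrete, here is a minimal sketch of the screening stage only; it is not the authors' PSI pruning rule, which the abstract does not spell out. It assumes binary (0/1) covariates, interaction features defined as products of columns, and top-k selection by the marginal score |z^T y|. The bound used for pruning (a descendant's score cannot exceed the larger of the positive and negative partial sums at its ancestor) is an assumption of this sketch, and the names `screen_top_k`, `interaction`, and `score_and_bound` are hypothetical.

```python
import heapq


def interaction(X, idx):
    """Product of the binary columns in idx, one value per sample."""
    return [int(all(row[j] for j in idx)) for row in X]


def score_and_bound(z, y):
    """Return |z^T y| and an upper bound on |z'^T y| for any z' <= z."""
    pos = sum(yi for zi, yi in zip(z, y) if zi and yi > 0)
    neg = sum(yi for zi, yi in zip(z, y) if zi and yi < 0)
    return abs(pos + neg), max(pos, -neg)


def screen_top_k(X, y, k, max_order):
    """Top-k marginal screening over interactions up to max_order.

    Index sets are enumerated as a tree: each child appends a column
    index larger than the last one, so every descendant is a superset
    of its ancestor and its feature vector is elementwise smaller.
    """
    d = len(X[0])
    heap = []      # min-heap of (score, index tuple); heap[0] is the k-th best
    visited = 0    # number of tree nodes actually evaluated

    def visit(idx, z):
        nonlocal visited
        visited += 1
        score, bound = score_and_bound(z, y)
        if len(heap) < k:
            heapq.heappush(heap, (score, idx))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, idx))
        if len(idx) == max_order:
            return
        # Prune: no descendant can beat the current k-th best score.
        if len(heap) == k and bound <= heap[0][0]:
            return
        for j in range(idx[-1] + 1, d):
            visit(idx + (j,), [zi * row[j] for zi, row in zip(z, X)])

    for j in range(d):
        visit((j,), interaction(X, (j,)))
    return sorted(heap, reverse=True), visited
```

Calling `screen_top_k(X, y, k=3, max_order=3)` returns the three highest-scoring index tuples together with the number of nodes evaluated; the second stage (fitting a regression on the selected features and computing selective p-values) is outside the scope of this sketch.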
