Safe Feature Pruning for Sparse High-Order Interaction Models

06/26/2015
by   Kazuya Nakagawa, et al.

Taking into account high-order interactions among covariates is valuable in many practical regression problems. This is, however, a computationally challenging task, because the number of high-order interaction features to be considered is extremely large unless the number of covariates is sufficiently small. In this paper, we propose a novel, efficient algorithm for LASSO-based sparse learning of such high-order interaction models. Our basic strategy for reducing the number of features is to employ the idea of the recently proposed safe feature screening (SFS) rule. An SFS rule has the property that, if a feature satisfies the rule, then the feature is guaranteed to be non-active in the LASSO solution, meaning that it can be safely screened out prior to the LASSO training process. If a large number of features can be screened out before training the LASSO, the computational cost and the memory requirement can be dramatically reduced. However, applying such an SFS rule to each of the extremely large number of high-order interaction features would be computationally infeasible. Our key idea for solving this computational issue is to exploit the underlying tree structure among high-order interaction features. Specifically, we introduce a pruning condition called the safe feature pruning (SFP) rule, which has the property that, if the rule is satisfied at a certain node of the tree, then all the high-order interaction features corresponding to its descendant nodes are guaranteed to be non-active at the optimal solution. Our algorithm is extremely efficient, making it possible to work, e.g., with 3rd order interactions of 10,000 original covariates, where the number of possible high-order interaction features is greater than 10^12.
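
The abstract does not state the SFP rule itself, so the following is only a minimal sketch of how a subtree-level pruning test of this kind could look, not the authors' algorithm. It assumes (none of this is given above) that covariates take values in [0, 1], that an interaction feature is the elementwise product of its covariates, and that a ball with center `theta` and radius `radius` known to contain the LASSO dual optimum is supplied by the caller, with a feature screened out when its bound falls below 1. All function names are hypothetical.

```python
from math import sqrt

def node_score(col, theta, radius):
    """Sphere-test upper bound on |x^T theta*| for one feature column."""
    dot = sum(x * t for x, t in zip(col, theta))
    return abs(dot) + radius * sqrt(sum(x * x for x in col))

def subtree_bound(col, theta, radius):
    """Upper bound on node_score for this node and every descendant.

    Valid because a descendant column is this column multiplied
    elementwise by values in [0, 1], so each entry can only shrink.
    """
    pos = sum(x * t for x, t in zip(col, theta) if t > 0)
    neg = -sum(x * t for x, t in zip(col, theta) if t < 0)
    return max(pos, neg) + radius * sqrt(sum(x * x for x in col))

def surviving_features(X, theta, radius, max_order):
    """Return (features not screened out, number of tree nodes visited).

    X is a list of rows with entries in [0, 1] (assumption).
    A feature is a tuple of increasing covariate indices.
    """
    d = len(X[0])
    kept = []
    visited = 0

    def visit(idx, col):
        nonlocal visited
        visited += 1
        if subtree_bound(col, theta, radius) < 1.0:
            return  # this node and all its descendants are screened out
        if node_score(col, theta, radius) >= 1.0:
            kept.append(idx)  # cannot be ruled out by the sphere test
        if len(idx) < max_order:
            for j in range(idx[-1] + 1, d):
                child = [c * row[j] for c, row in zip(col, X)]
                visit(idx + (j,), child)

    for j in range(d):
        visit((j,), [row[j] for row in X])
    return kept, visited

# Tiny illustrative input (made up for this sketch).
X = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [0, 1, 1, 0],
     [1, 1, 1, 0],
     [0, 0, 1, 1]]
theta = [0.9, -0.5, 0.3, 0.6, -0.7]
kept, visited = surviving_features(X, theta, radius=0.05, max_order=3)
print(kept, visited)
```

The point of the design is that `subtree_bound` is computed once per visited node, yet rules out a whole subtree, so the per-feature test never has to be evaluated on the pruned descendants. How the ball around the dual optimum is obtained is left outside this sketch.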

research
06/26/2015

An Efficient Post-Selection Inference on High-Order Interaction Models

Finding statistically significant high-order interaction features in pre...
research
06/09/2021

Fast and More Powerful Selective Inference for Sparse High-order Interaction Model

Automated high-stake decision-making such as medical diagnosis requires ...
research
02/15/2016

Safe Pattern Pruning: An Efficient Approach for Predictive Pattern Mining

In this paper we study predictive pattern mining problems where the goal...
research
02/09/2020

Learning High Order Feature Interactions with Fine Control Kernels

We provide a methodology for learning sparse statistical models that use...
research
02/23/2021

Learning High-Order Interactions via Targeted Pattern Search

Logistic Regression (LR) is a widely used statistical method in empirica...
research
10/03/2018

Learning sparse optimal rule fit by safe screening

In this paper, we consider linear prediction models in the form of a spa...
research
02/23/2021

Provable Boolean Interaction Recovery from Tree Ensemble obtained via Random Forests

Random Forests (RF) are at the cutting edge of supervised machine learni...
