Reluctant Interaction Modeling

07/19/2019
by   Guo Yu, et al.

Including pairwise interactions between the predictors of a regression model can improve predictive performance. However, fitting such interaction models on typical data sets in biology and other fields often requires solving enormous variable selection problems with billions of interactions. The scale of such problems demands methods that are computationally cheap (in both time and memory) yet still have sound statistical properties. Motivated by these large-scale problems, we adopt a very simple guiding principle: one should prefer a main effect over an interaction if all else is equal. This "reluctance" toward interactions, while reminiscent of the hierarchy principle for interactions, is much less restrictive. We design a computationally efficient method built upon this principle and provide theoretical results indicating favorable statistical properties. Empirical results show dramatic computational improvement without sacrificing statistical properties. For example, the proposed method can solve a problem with 10 billion interactions with 5-fold cross-validation in under 7 hours on a single CPU.
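The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the "prefer a main effect over an interaction" principle could be acted on, not the authors' method. It assumes plain least squares for the main effects and a simple correlation-with-residual score for interactions. The function name `reluctant_screen`, the parameter `top_k`, and the simulated data are all hypothetical.

```python
import numpy as np

def reluctant_screen(X, y, top_k=5):
    """Hypothetical helper: give main effects the first chance to explain y,
    then rank pairwise interactions by what they add beyond the main effects."""
    n, p = X.shape
    # Step 1: least squares on an intercept plus main effects only.
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    # Step 2: score each product X_j * X_k (j < k) by its absolute
    # correlation with the residual left over from the main-effects fit.
    scores = []
    for j in range(p):
        for k in range(j + 1, p):
            z = X[:, j] * X[:, k]
            z = z - z.mean()
            denom = np.linalg.norm(z) * np.linalg.norm(residual)
            score = abs(z @ residual) / denom if denom > 0 else 0.0
            scores.append((score, j, k))
    scores.sort(reverse=True)
    return [(j, k) for _, j, k in scores[:top_k]]

# Simulated data (an assumption for illustration): one main effect on
# predictor 0 and one interaction between predictors 1 and 2.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(500)
top = reluctant_screen(X, y, top_k=3)
```

Because the interactions are scored only against the residual, an interaction is ranked highly only when it explains variation that the main effects could not. The double loop visits all p(p-1)/2 pairs, so a sketch like this would not scale to the billions of interactions the abstract mentions without further engineering.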


research
02/05/2019

Learning Hierarchical Interactions at Scale: A Convex Optimization Approach

In many learning settings, it is beneficial to augment the main features...
research
12/18/2018

A robust estimation for the extended t-process regression model

Robust estimation and variable selection procedure are developed for the...
research
07/28/2014

Efficient Regularized Regression for Variable Selection with L0 Penalty

Variable (feature, gene, model, which we use interchangeably) selections...
research
03/24/2021

A scalable hierarchical lasso for gene-environment interactions

We describe a regularized regression model for the selection of gene-env...
research
10/14/2022

Variable Importance Based Interaction Modeling with an Application on Initial Spread of COVID-19 in China

Interaction selection for linear regression models with both continuous ...
research
03/24/2020

Interaction Pursuit Biconvex Optimization

Multivariate regression models are widely used in various fields such as...
research
11/24/2020

A Self-Penalizing Objective Function for Scalable Interaction Detection

We tackle the problem of nonparametric variable selection with a focus o...
