Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives

01/17/2020
by   Antoine Dedieu, et al.
8

We consider a discrete optimization based approach for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to solve (to optimality) ℓ_0-regularized problems at scales much larger than what was conventionally considered possible in the statistics and machine learning communities. Despite their usefulness, MIP-based approaches are significantly slower compared to relatively mature algorithms based on ℓ_1-regularization and relatives. We aim to bridge this computational gap by developing new MIP-based algorithms for ℓ_0-regularized classification. We propose two classes of scalable algorithms: an exact algorithm that can handle p≈ 50,000 features in a few minutes, and approximate algorithms that can address instances with p≈ 10^6 in times comparable to fast ℓ_1-based algorithms. Our exact algorithm is based on the novel idea of integrality generation, which solves the original problem (with p binary variables) via a sequence of mixed integer programs that involve a small number of binary variables. Our approximate algorithms are based on coordinate descent and local combinatorial search. In addition, we present new estimation error bounds for a class of ℓ_0-regularized estimators. Experiments on real and synthetic data demonstrate that our approach leads to models with considerably improved statistical performance (especially, variable selection) when compared to competing toolkits.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/14/2021

Grouped Variable Selection with Discrete Optimization: Computational and Statistical Perspectives

We present a new algorithmic framework for grouped variable selection th...
research
03/05/2018

Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms

We consider the canonical L_0-regularized least squares problem (aka bes...
research
09/23/2021

Sparse PCA: A New Scalable Estimator Based On Integer Programming

We consider the Sparse Principal Component Analysis (SPCA) problem under...
research
04/17/2020

Sparse Regression at Scale: Branch-and-Bound rooted in First-Order Optimization

We consider the least squares regression problem, penalized with a combi...
research
07/18/2023

Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

We consider the problem of learning a sparse graph underlying an undirec...
research
07/15/2021

Learning Mixed-Integer Linear Programs from Contextual Examples

Mixed-integer linear programs (MILPs) are widely used in artificial inte...
research
08/10/2020

SWITSS: Computing Small Witnessing Subsystems

Witnessing subsystems for probabilistic reachability thresholds in discr...

Please sign up or login with your details

Forgot password? Click here to reset