COMBSS: Best Subset Selection via Continuous Optimization

05/05/2022
by Sarat Moka, et al.

We consider the problem of best subset selection in linear regression, where the goal is to find, for every model size k, the subset of k features that best fits the response. This is particularly challenging when the total number of available features is very large compared to the number of data samples. We propose COMBSS, a novel continuous-optimization-based method that identifies a solution path: a small set of models of varying size that are candidates for the best subset in linear regression. COMBSS turns out to be very fast, making subset selection possible when the number of features is well into the thousands. Simulation results are presented to highlight the performance of COMBSS in comparison with existing popular methods such as Forward Stepwise, the Lasso, and Mixed-Integer Optimization. Because of this outstanding overall performance, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction in a large variety of regression models.
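To make the problem statement concrete, here is a minimal sketch of exhaustive best subset selection on a toy data set. It is not COMBSS; it is the brute-force baseline whose cost grows combinatorially with the number of features, which is the difficulty the abstract describes. The function name `best_subsets`, the synthetic data, and the use of residual sum of squares (RSS) as the fit criterion are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def best_subsets(X, y):
    """Return {k: (feature_indices, rss)} with the lowest-RSS subset for each size k.

    Hypothetical helper for illustration: it enumerates every subset, so it is
    only usable when the number of features p is small.
    """
    n, p = X.shape
    path = {}
    for k in range(1, p + 1):
        best = None
        for subset in combinations(range(p), k):
            cols = list(subset)
            # Ordinary least squares restricted to the chosen columns.
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ beta
            rss = float(resid @ resid)
            if best is None or rss < best[1]:
                best = (subset, rss)
        path[k] = best
    return path


# Toy data (assumed): only features 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.standard_normal(50)

path = best_subsets(X, y)
for k, (features, rss) in path.items():
    print(k, features, round(rss, 3))
```

With 6 features the loop fits 63 models; with p features it fits 2^p - 1, which is why exhaustive search is impractical for large p and why faster alternatives are of interest.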
