Bilevel Optimization for Feature Selection in the Data-Driven Newsvendor Problem

09/12/2022
by Breno Serrano, et al.

We study the feature-based newsvendor problem, in which a decision-maker has access to historical data consisting of demand observations and exogenous features. In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance. To date, state-of-the-art methods rely on regularization, which penalizes the number of selected features or the norm of the solution vector. As an alternative, we introduce a novel bilevel programming formulation. The upper-level problem selects a subset of features that minimizes an estimate of the out-of-sample cost of ordering decisions based on a held-out validation set. The lower-level problem learns the optimal coefficients of the decision function on a training set, using only the features selected by the upper-level problem. We present a mixed-integer linear program reformulation of the bilevel program, which can be solved to optimality with standard optimization solvers. Our computational experiments show that the method accurately recovers ground-truth features even for instances with only a few hundred observations. In contrast, regularization-based techniques often fail at feature recovery or require thousands of observations to obtain similar accuracy. Regarding out-of-sample generalization, we achieve improved or comparable cost performance.
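To make the two-level structure described in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' method. It replaces the mixed-integer reformulation with brute-force enumeration of feature subsets, which is only viable for a handful of features. The linear decision rule with an intercept, the cost parameters `b` (underage) and `h` (overage), the tie-breaking tolerance `tol`, and all function names are hypothetical choices made for illustration. The lower level is solved exactly by enumerating fits that interpolate as many training points as there are coefficients, which assumes the restricted design matrix has full column rank.

```python
from itertools import combinations


def newsvendor_cost(order, demand, b, h):
    """Cost b per unit of unmet demand, h per unit of leftover stock."""
    return b * max(demand - order, 0.0) + h * max(order - demand, 0.0)


def avg_cost(beta, X, d, b, h):
    """Average newsvendor cost of the linear decision rule beta on (X, d)."""
    orders = (sum(c * v for c, v in zip(beta, row)) for row in X)
    return sum(newsvendor_cost(q, y, b, h) for q, y in zip(orders, d)) / len(d)


def solve_square(A, y):
    """Gauss-Jordan elimination with partial pivoting; None if singular."""
    n = len(A)
    M = [list(map(float, row)) + [float(v)] for row, v in zip(A, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_lower_level(X, d, b, h):
    """Lower level: coefficients minimizing the training cost.

    Assumption: X has full column rank k, so some optimal solution
    interpolates k training points; we enumerate all such candidates.
    """
    k = len(X[0])
    best_beta, best_cost = None, float("inf")
    for idx in combinations(range(len(X)), k):
        beta = solve_square([X[i] for i in idx], [d[i] for i in idx])
        if beta is None:
            continue
        cost = avg_cost(beta, X, d, b, h)
        if cost < best_cost:
            best_beta, best_cost = beta, cost
    return best_beta


def design(features, subset):
    """Intercept column plus the selected feature columns."""
    return [[1.0] + [row[j] for j in subset] for row in features]


def select_features(F_train, d_train, F_val, d_val, b, h, tol=1e-9):
    """Upper level: subset with the lowest validation cost.

    Smaller subsets are visited first, and a later subset must improve
    by more than tol, so ties are resolved in favor of sparser models.
    """
    m = len(F_train[0])
    best_subset, best_val = None, float("inf")
    for size in range(m + 1):
        for subset in combinations(range(m), size):
            beta = fit_lower_level(design(F_train, subset), d_train, b, h)
            if beta is None:
                continue
            val = avg_cost(beta, design(F_val, subset), d_val, b, h)
            if val < best_val - tol:
                best_subset, best_val = subset, val
    return best_subset, best_val


# Synthetic, noiseless toy data (illustrative only): demand depends on
# feature 0 alone; feature 1 is irrelevant.
F = [[float(i % 5), float((i * i) % 7)] for i in range(16)]
d = [10.0 + 2.0 * row[0] for row in F]
subset, val_cost = select_features(F[:10], d[:10], F[10:], d[10:], b=4.0, h=1.0)
print(subset, val_cost)
```

The enumeration grows exponentially in the number of features and combinatorially in the training-set size, which is one reason a reformulation that a standard solver can handle is attractive. When the lower-level problem has several optimal coefficient vectors, this sketch simply keeps the first one it finds.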

research
10/08/2020

Robust Multi-class Feature Selection via l_2,0-Norm Regularization Minimization

Feature selection is an important data preprocessing in data mining and ...
research
02/19/2019

An entropic feature selection method in perspective of Turing formula

Health data are generally complex in type and small in sample size. Such...
research
05/28/2022

Feature subset selection for kernel SVM classification via mixed-integer optimization

We study the mixed-integer optimization (MIO) approach to feature subset...
research
02/14/2012

Generalized Fisher Score for Feature Selection

Fisher score is one of the most widely used supervised feature selection...
research
02/18/2019

Sparse Regression: Scalable algorithms and empirical performance

In this paper, we review state-of-the-art methods for feature selection ...
research
08/07/2018

Efficient and Effective L_0 Feature Selection

Because of continuous advances in mathematical programing, Mix Integer O...
research
07/12/2020

Simultaneous Feature Selection and Outlier Detection with Optimality Guarantees

Sparse estimation methods capable of tolerating outliers have been broad...
