Deselection of Base-Learners for Statistical Boosting – with an Application to Distributional Regression

02/03/2022
by Annika Strömer, et al.

We present a new procedure for enhanced variable selection in component-wise gradient boosting. Statistical boosting is a computational approach that emerged from machine learning and allows regression models to be fitted in the presence of high-dimensional data. Furthermore, the algorithm performs data-driven variable selection. In practice, however, the final models tend to include too many variables in some situations. This occurs particularly with low-dimensional data (p < n), where boosting overfits only slowly: additional variables are included in the final model without altering the prediction accuracy. Many of these false positives enter with small coefficients and therefore have little impact, but they enlarge the model. We address this issue by giving the algorithm the chance to deselect base-learners of minor importance. We analyze the impact of the new approach on variable selection and prediction performance in comparison to alternative methods, including boosting with earlier stopping as well as twin boosting. We illustrate our approach with data from an ongoing cohort study of chronic kidney disease patients, in which the most influential predictors of a health-related quality of life measure are selected within a distributional regression approach based on beta regression.
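To make the idea concrete, the following minimal Python sketch (not the authors' implementation) illustrates the principle behind a deselection step: component-wise L2-boosting with simple linear base-learners records, for each variable, the empirical risk reduction attributable to its updates; base-learners whose attributable risk reduction falls below a hypothetical fraction tau of the total risk reduction are deselected, and the model is refitted on the remaining variables.

import numpy as np

def componentwise_l2_boost(X, y, n_iter=250, nu=0.1):
    # Component-wise L2-boosting with univariate linear base-learners.
    # Returns the offset, the coefficients, and the empirical risk reduction
    # attributed to each variable (used for the deselection step below).
    n, p = X.shape
    offset = y.mean()
    resid = y - offset                      # negative gradient of the L2 loss
    coef = np.zeros(p)
    risk_reduction = np.zeros(p)
    for _ in range(n_iter):
        risk_before = np.mean(resid ** 2)
        # fit every base-learner to the current residuals (univariate least squares)
        betas = X.T @ resid / np.sum(X ** 2, axis=0)
        sse = np.array([np.sum((resid - X[:, j] * betas[j]) ** 2) for j in range(p)])
        j = int(np.argmin(sse))             # best-fitting base-learner
        coef[j] += nu * betas[j]            # weak (shrunken) update
        resid -= nu * X[:, j] * betas[j]
        risk_reduction[j] += risk_before - np.mean(resid ** 2)
    return offset, coef, risk_reduction

def deselect(risk_reduction, tau=0.01):
    # Deselect base-learners whose attributable risk reduction is below
    # a fraction tau of the total risk reduction (tau is an assumed tuning choice).
    total = risk_reduction.sum()
    return np.flatnonzero(risk_reduction > tau * total)

# toy example: 5 informative out of 50 candidate variables
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
y = X[:, :5] @ np.array([2.0, -1.5, 1.0, 0.5, -0.5]) + rng.standard_normal(200)

offset, coef, rr = componentwise_l2_boost(X, y)
kept = deselect(rr, tau=0.01)
# refit the boosting model using only the retained base-learners
offset2, coef2, _ = componentwise_l2_boost(X[:, kept], y)
print("selected before deselection:", np.flatnonzero(coef))
print("retained after deselection :", kept)

The paper applies this idea within statistical boosting for distributional (beta) regression rather than plain L2-boosting; the sketch only conveys the mechanism of attributing risk reduction to base-learners and refitting after deselection.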

