Gradient-based Regularization Parameter Selection for Problems with Non-smooth Penalty Functions

03/28/2017
by Jean Feng et al.

In high-dimensional and/or non-parametric regression problems, regularization (or penalization) is used to control model complexity and induce desired structure. Each penalty has a weight parameter that indicates how strongly the corresponding structure should be enforced. Typically these parameters are chosen to minimize the error on a separate validation set using a simple grid search or a gradient-free optimization method. Tuning is more efficient when the gradient of the validation loss with respect to the parameters can be computed, but this is often difficult for problems with non-smooth penalty functions. Here we show that for many penalized regression problems, the validation loss is in fact smooth almost everywhere with respect to the penalty parameters. We can therefore apply a modified gradient descent algorithm to tune the parameters. Through simulation studies on example regression problems, we find that increasing the number of penalty parameters and tuning them using our method can decrease the generalization error.
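The key observation, that the validation loss is smooth almost everywhere in the penalty parameters, can be made concrete with a small sketch. The code below is not the authors' implementation; it assumes a single lasso penalty, scikit-learn's Lasso solver, and a squared-error validation loss. Wherever the lasso active set is locally stable, the fitted coefficients are a smooth (in fact locally affine) function of the penalty parameter, so the KKT stationarity conditions yield an exact local gradient of the validation loss, and gradient descent on log(alpha) can tune the parameter:

```python
# A minimal sketch (not the paper's released code) of almost-everywhere
# gradient descent on a lasso penalty parameter alpha.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0
y = X @ beta_true + rng.standard_normal(n)

# Split into training and validation sets.
X_tr, y_tr = X[:100], y[:100]
X_val, y_val = X[100:], y[100:]

def val_loss_and_grad(alpha):
    """Validation loss and its almost-everywhere gradient w.r.t. alpha.

    On the event that the active set is locally constant, the lasso KKT
    conditions give a closed form for d(beta)/d(alpha) on the active set.
    """
    fit = Lasso(alpha=alpha, fit_intercept=False, max_iter=50_000).fit(X_tr, y_tr)
    beta = fit.coef_
    active = np.flatnonzero(beta)
    resid = X_val @ beta - y_val
    loss = np.mean(resid ** 2)
    if active.size == 0:
        return loss, 0.0  # beta == 0 locally, so the validation loss is flat
    XA = X_tr[:, active]
    # sklearn's lasso objective is (1/2n)||y - Xb||^2 + alpha*||b||_1, so
    # stationarity on the active set gives XA'XA b_A = XA'y - n*alpha*s,
    # hence d(beta_A)/d(alpha) = -n (XA'XA)^{-1} s, with s = sign(beta_A).
    s = np.sign(beta[active])
    dbeta = -len(y_tr) * np.linalg.solve(XA.T @ XA, s)
    # Chain rule through the validation loss (mean squared error).
    grad = 2.0 * np.mean(resid[:, None] * X_val[:, active], axis=0) @ dbeta
    return loss, grad

# Gradient descent on log(alpha) keeps the penalty parameter positive.
log_alpha, step = np.log(0.5), 0.05
for t in range(100):
    alpha = np.exp(log_alpha)
    loss, grad = val_loss_and_grad(alpha)
    log_alpha -= step * grad * alpha  # d/dlog(alpha) = alpha * d/dalpha
    if t % 20 == 0:
        print(f"iter {t:3d}  alpha={alpha:.4f}  val loss={loss:.4f}")
```

The gradient above is exact wherever the active set does not change; the penalty values at which it does change form a measure-zero set where the validation loss is non-differentiable, which is why the abstract describes a modified, rather than vanilla, gradient descent.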


Related research

03/28/2019 · An analysis of the cost of hyper-parameter selection via split-sample validation, with applications to penalized regression
In the regression setting, given a set of hyper-parameters, a model-esti...

02/10/2023 · On Penalty-based Bilevel Gradient Descent Method
Bilevel optimization enjoys a wide range of applications in hyper-parame...

02/20/2019 · Cross Validation for Penalized Quantile Regression with a Case-Weight Adjusted Solution Path
Cross validation is widely used for selecting tuning parameters in regul...

06/06/2017 · Shape Parameter Estimation
Performance of machine learning approaches depends strongly on the choic...

10/12/2018 · Safe Grid Search with Optimal Complexity
Popular machine learning estimators involve regularization parameters th...

06/29/2023 · Solving Kernel Ridge Regression with Gradient-Based Optimization Methods
Kernel ridge regression, KRR, is a non-linear generalization of linear r...

06/07/2020 · What needles do sparse neural networks find in nonlinear haystacks
Using a sparsity inducing penalty in artificial neural networks (ANNs) a...
