Implicit Regularization in ReLU Networks with the Square Loss

12/09/2020
by Gal Vardi et al.

Understanding the implicit regularization (or implicit bias) of gradient descent has recently been a very active research area. However, the implicit regularization of nonlinear neural networks is still poorly understood, especially for regression losses such as the square loss. Perhaps surprisingly, we prove that even for a single ReLU neuron, the implicit regularization with the square loss cannot be characterized by any explicit function of the model parameters (although, on the positive side, we show it can be characterized approximately). For one-hidden-layer networks, we prove a similar result: in general, it is impossible to characterize implicit regularization properties in this manner, except for the "balancedness" property identified in Du et al. [2018]. Our results suggest that a more general framework than the one considered so far may be needed to understand implicit regularization for nonlinear predictors, and provide some clues on what this framework should be.
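To make the "balancedness" property concrete, here is a minimal NumPy sketch (illustrative, not taken from the paper): for a one-hidden-layer ReLU network f(x) = Σ_j v_j [⟨w_j, x⟩]_+ trained with the square loss, the per-neuron quantities v_j² − ‖w_j‖² identified in Du et al. [2018] are conserved exactly under gradient flow, and drift only slightly under discrete gradient descent with a small step size. The data, dimensions, and learning rate below are arbitrary assumptions.

```python
import numpy as np

# Sketch: gradient descent on a one-hidden-layer ReLU network
# f(x) = sum_j v_j * relu(<w_j, x>) with the square loss, tracking
# the per-neuron "balancedness" quantities v_j^2 - ||w_j||^2 from
# Du et al. [2018]. Under gradient flow these are exactly conserved;
# with a small step size they stay approximately constant.
# All data and hyperparameters here are illustrative assumptions.

rng = np.random.default_rng(0)
n, d, k = 50, 5, 10               # samples, input dim, hidden width
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

W = rng.normal(size=(k, d)) * 0.1  # hidden-layer weights w_j
v = rng.normal(size=k) * 0.1       # output weights v_j

def balancedness(W, v):
    # v_j^2 - ||w_j||^2 for each hidden neuron j
    return v**2 - (W**2).sum(axis=1)

lr = 1e-3
b0 = balancedness(W, v)
for _ in range(2000):
    Z = X @ W.T                    # pre-activations, shape (n, k)
    H = np.maximum(Z, 0.0)         # ReLU activations
    r = H @ v - y                  # residuals of the square loss
    # Gradients of L = 0.5 * mean((f(x_i) - y_i)^2)
    grad_v = H.T @ r / n
    grad_W = ((r[:, None] * (Z > 0)) * v[None, :]).T @ X / n
    v -= lr * grad_v
    W -= lr * grad_W

drift = np.max(np.abs(balancedness(W, v) - b0))
print(f"max drift in v_j^2 - ||w_j||^2 after training: {drift:.2e}")
```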


Related research

02/11/2022
Support Vectors and Gradient Dynamics for Implicit Bias in ReLU Networks
Understanding implicit bias of gradient descent has been an important go...

01/30/2022
Implicit Regularization Towards Rank Minimization in ReLU Networks
We study the conjectured relationship between the implicit regularizatio...

10/11/2021
A global convergence theory for deep ReLU implicit networks via over-parameterization
Implicit deep learning has received increasing attention recently due to...

01/28/2022
Limitation of characterizing implicit regularization by data-independent functions
In recent years, understanding the implicit regularization of neural net...

08/04/2020
Shallow Univariate ReLu Networks as Splines: Initialization, Loss Surface, Hessian, Gradient Flow Dynamics
Understanding the learning dynamics and inductive bias of neural network...

05/16/2023
Deep ReLU Networks Have Surprisingly Simple Polytopes
A ReLU network is a piecewise linear function over polytopes. Figuring o...

08/03/2020
Implicit Regularization in Deep Learning: A View from Function Space
We approach the problem of implicit regularization in deep learning from...
