From Tempered to Benign Overfitting in ReLU Neural Networks

05/24/2023
by Guy Kornowski, et al.

Overparameterized neural networks (NNs) are observed to generalize well even when trained to perfectly fit noisy data. This phenomenon has motivated a large body of work on "benign overfitting", in which interpolating predictors achieve near-optimal performance. Recently, it was conjectured and empirically observed that the behavior of NNs is often better described as "tempered overfitting", where the performance is non-optimal yet also non-trivial, and degrades as a function of the noise level. However, a theoretical justification of this claim for non-linear NNs has been lacking so far. In this work, we provide several results that aim at bridging these complementary views. We study a simple classification setting with 2-layer ReLU NNs, and prove that under various assumptions, the type of overfitting transitions from tempered in the extreme case of one-dimensional data to benign in high dimensions. Thus, we show that the input dimension plays a crucial role in determining the type of overfitting in this setting, which we also validate empirically for intermediate dimensions. Overall, our results shed light on the intricate connections between the dimension, sample size, architecture, and training algorithm on the one hand, and the type of resulting overfitting on the other.
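As a rough illustration of the setting described in the abstract, the following is a minimal sketch (not the authors' code, and not their exact data model): it trains a 2-layer ReLU network to fit noisy binary labels and compares the interpolating network's test error with the label-noise level p, the comparison that distinguishes tempered from benign overfitting. The sign(x_1) target, the Gaussian inputs, and all hyperparameters (width, learning rate, number of steps) are illustrative assumptions.

```python
# Minimal sketch (illustrative only): train a 2-layer ReLU network to
# interpolate noisy binary labels, then compare test error to the noise
# level p. Test error that tracks p (but stays bounded away from optimal)
# suggests tempered overfitting; test error near the noiseless optimum
# suggests benign overfitting. Dimension d and hyperparameters are
# assumptions for illustration, not the paper's exact setting.
import torch
import torch.nn as nn

def run_experiment(d=100, n_train=200, n_test=5000, p=0.2, width=1000, steps=5000):
    # Synthetic data: x ~ N(0, I_d), clean label = sign of the first
    # coordinate, training labels flipped independently with probability p.
    X = torch.randn(n_train, d)
    y_clean = torch.sign(X[:, 0])
    flip = (torch.rand(n_train) < p).float()
    y = y_clean * (1 - 2 * flip)  # flip a p-fraction of the labels

    X_test = torch.randn(n_test, d)
    y_test = torch.sign(X_test[:, 0])

    # 2-layer ReLU network trained with logistic loss on the noisy labels.
    model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    loss_fn = nn.SoftMarginLoss()  # logistic loss for +-1 labels

    for _ in range(steps):
        opt.zero_grad()
        out = model(X).squeeze(-1)
        loss = loss_fn(out, y)
        loss.backward()
        opt.step()

    with torch.no_grad():
        train_err = (torch.sign(model(X).squeeze(-1)) != y).float().mean().item()
        test_err = (torch.sign(model(X_test).squeeze(-1)) != y_test).float().mean().item()
    return train_err, test_err

# Example: compare a low-dimensional and a high-dimensional run at noise p=0.2.
for d in (1, 500):
    tr, te = run_experiment(d=d)
    print(f"d={d}: train error={tr:.3f} (0 = interpolation), test error={te:.3f}, noise p=0.2")
```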


Related research:

- Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension (05/23/2023)
- Benign Overfitting for Two-layer ReLU Networks (03/07/2023)
- Benign, Tempered, or Catastrophic: A Taxonomy of Overfitting (07/14/2022)
- Benign Overfitting in Two-layer Convolutional Neural Networks (02/14/2022)
- Noisy Interpolation Learning with Shallow Univariate ReLU Networks (07/28/2023)
- A new measure for overfitting and its implications for backdooring of deep learning (06/11/2020)
- A Study of Neural Training with Non-Gradient and Noise Assisted Gradient Methods (05/08/2020)
