The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks

08/25/2021
by Niladri S. Chatterji, et al.

The recent success of neural network models has shed light on a rather surprising statistical phenomenon: models that perfectly fit noisy training data can still generalize well to unseen test data. Understanding this phenomenon of benign overfitting has attracted intense theoretical and empirical study. In this paper, we consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss, and derive bounds on the excess risk when the covariates satisfy sub-Gaussianity and anti-concentration properties and the noise is independent and sub-Gaussian. By leveraging recent results that characterize the implicit bias of this estimator, our bounds emphasize the role of both the quality of the initialization and the properties of the data covariance matrix in achieving low excess risk.
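As a concrete illustration of the setting (a minimal sketch, not the paper's analysis), the code below trains an overparametrized two-layer linear network f(x) = aᵀWx with full-batch gradient descent (a discretization of gradient flow) on the squared loss over noisy linear data until it nearly interpolates, then measures the excess risk of the induced linear predictor. The dimensions, initialization scale, step size, and isotropic-Gaussian data model are illustrative assumptions.

```python
# Minimal sketch of the setting (not the paper's analysis): a two-layer
# linear network f(x) = a^T W x trained by full-batch gradient descent
# (a discretization of gradient flow) on the squared loss. All scales
# and dimensions below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 100, 50             # samples, input dim (d > n), hidden width

# Noisy linear data: y_i = <theta_star, x_i> + independent sub-Gaussian noise.
theta_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))       # sub-Gaussian (here Gaussian) covariates
y = X @ theta_star + 0.1 * rng.normal(size=n)

# Small, roughly balanced random initialization; its scale is one of the
# quantities the paper's bounds depend on.
W = 0.1 * rng.normal(size=(m, d)) / np.sqrt(d)
a = 0.1 * rng.normal(size=m) / np.sqrt(m)

lr = 0.1
for step in range(100_000):
    pred = (X @ W.T) @ a          # network output on the training set
    resid = pred - y
    loss = 0.5 * np.mean(resid ** 2)
    if loss < 1e-10:              # (near-)interpolation reached
        break
    grad_pred = resid / n         # dL/dpred for L = (1/2n) ||resid||^2
    grad_W = np.outer(a, grad_pred @ X)   # chain rule: dpred_i/dW = a x_i^T
    grad_a = (X @ W.T).T @ grad_pred      # chain rule: dpred_i/da = W x_i
    W -= lr * grad_W
    a -= lr * grad_a

# The trained network computes the linear map theta_hat = W^T a. For
# isotropic covariates the excess risk equals the squared parameter error.
theta_hat = W.T @ a
excess_risk = float(np.sum((theta_hat - theta_star) ** 2))
print(f"steps: {step}, train loss: {loss:.2e}, excess risk: {excess_risk:.3f}")
```

Whether the excess risk stays small despite the training loss being driven to (near) zero depends on the initialization scale and the spectrum of the data covariance, which is what the paper's bounds quantify; the isotropic data above is only one convenient test case.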

