Statistical Optimality of Deep Wide Neural Networks

05/04/2023
by   Yicheng Li, et al.

In this paper, we study the generalization ability of deep, wide feedforward ReLU neural networks defined on a bounded domain 𝒳⊂ℝ^d. We first show that the generalization ability of such a neural network is fully characterized by that of the corresponding deep neural tangent kernel (NTK) regression. We then investigate the spectral properties of the deep NTK and show that it is positive definite on 𝒳 and that its eigenvalue decay rate is (d+1)/d. Thanks to the well-established theory of kernel regression, we conclude that multilayer wide neural networks trained by gradient descent with proper early stopping achieve the minimax rate, provided that the regression function lies in the reproducing kernel Hilbert space (RKHS) associated with the corresponding NTK. Finally, we show that overfitted multilayer wide neural networks cannot generalize well on 𝕊^d.
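The deep NTK discussed in the abstract admits a layer-wise closed form for ReLU activations. As a minimal illustrative sketch (not the paper's own code), the standard recursion for a bias-free, fully connected ReLU network in the NTK parameterization can be written as follows; the function name and the depth default are assumptions for illustration:

```python
import numpy as np

def deep_relu_ntk(x, xp, depth=3):
    """Sketch of the depth-layer fully connected ReLU NTK recursion
    (NTK parameterization with activation scaling c_sigma = 2)."""
    # Input-layer kernels: plain inner products
    sig_xx, sig_pp, sig_xp = float(x @ x), float(xp @ xp), float(x @ xp)
    ntk = sig_xp
    for _ in range(depth):
        norm = np.sqrt(sig_xx * sig_pp)
        cos_t = np.clip(sig_xp / norm, -1.0, 1.0)  # guard against rounding
        theta = np.arccos(cos_t)
        # Closed-form ReLU (arc-cosine) expressions for the covariance
        # of the next layer and its derivative kernel
        sig_dot = (np.pi - theta) / np.pi
        sig_xp = norm * (np.sin(theta) + (np.pi - theta) * cos_t) / np.pi
        # Under this normalization the diagonal entries are preserved,
        # so sig_xx and sig_pp stay fixed across layers
        ntk = ntk * sig_dot + sig_xp
    return ntk
```

With this normalization the diagonal grows linearly in depth, e.g. for a unit vector x one gets deep_relu_ntk(x, x, depth=L) = L + 1, which is a quick sanity check that the recursion is implemented consistently.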


Related research

02/12/2023 · Generalization Ability of Wide Neural Networks on ℝ
We perform a study on the generalization ability of the wide two-layer R...

09/21/2020 · Kernel-Based Smoothness Analysis of Residual Networks
A major factor in the success of deep neural networks is the use of soph...

05/29/2023 · Generalization Ability of Wide Residual Networks
In this paper, we study the generalization ability of the wide residual ...

09/14/2023 · How many Neurons do we need? A refined Analysis for Shallow Networks trained with Gradient Descent
We analyze the generalization properties of two-layer neural networks in...

02/14/2020 · Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? – A Neural Tangent Kernel Perspective
Deep residual networks (ResNets) have demonstrated better generalization...

12/28/2022 · Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks
We explore the ability of overparameterized shallow ReLU neural networks...

04/10/2019 · Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
The behavior of the gradient descent (GD) algorithm is analyzed for a de...
