Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit

02/16/2019
by Song Mei, et al.

We consider learning two-layer neural networks using stochastic gradient descent (SGD). The mean-field description of this learning dynamics approximates the evolution of the network weights by an evolution in the space of probability distributions on R^D (where D is the number of parameters associated with each neuron). This evolution can be defined through a partial differential equation or, equivalently, as a gradient flow in the Wasserstein space of probability distributions. Earlier work shows that (under some regularity assumptions) the mean-field description is accurate as soon as the number of hidden units is much larger than the dimension D. In this paper we establish stronger and more general approximation guarantees. First, we show that the number of hidden units only needs to be larger than a quantity that depends on the regularity properties of the data and is independent of the dimension. Next, we generalize this analysis to the case of unbounded activation functions, which was not covered by earlier bounds. We extend our results to noisy stochastic gradient descent. Finally, we show that kernel ridge regression can be recovered as a special limit of the mean-field analysis.
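The abstract views noisy SGD on a two-layer network as a system of N interacting "particles" (one per hidden unit) whose empirical distribution, for large N, tracks the mean-field PDE / Wasserstein gradient flow. The following is a minimal illustrative sketch of that particle viewpoint, not the paper's code: the squared loss, tanh activation, synthetic Gaussian data, step size, and noise temperature are assumptions made here purely for illustration.

import numpy as np

rng = np.random.default_rng(0)

d = 20       # input dimension
N = 1000     # number of hidden units ("particles"); here D = d + 1 parameters per neuron
steps = 20000
lr = 0.05    # step size, already scaled so each particle moves at an O(1) rate
tau = 1e-3   # temperature of the injected Gaussian noise (noisy SGD)

# Neuron i carries parameters theta_i = (a_i, w_i) in R^(d+1).
a = rng.normal(size=N)
W = rng.normal(size=(N, d)) / np.sqrt(d)

def predict(x, a, W):
    # Mean-field scaling: f(x) = (1/N) * sum_i a_i * tanh(<w_i, x>)
    return a @ np.tanh(W @ x) / N

# Synthetic teacher used only to generate data for this illustration.
w_star = rng.normal(size=d) / np.sqrt(d)

for _ in range(steps):
    x = rng.normal(size=d)
    y = np.tanh(w_star @ x)
    act = np.tanh(W @ x)            # shape (N,)
    err = predict(x, a, W) - y      # scalar residual of the squared loss
    # Per-particle gradient; the 1/N from the output scaling is absorbed into
    # lr so that each theta_i moves at an O(1) rate (mean-field, as opposed to
    # lazy/kernel, regime).
    grad_a = err * act
    grad_W = (err * a * (1.0 - act ** 2))[:, None] * x[None, :]
    # Noisy SGD step: gradient step plus Langevin noise at temperature tau.
    a -= lr * grad_a + np.sqrt(2.0 * lr * tau) * rng.normal(size=N)
    W -= lr * grad_W + np.sqrt(2.0 * lr * tau) * rng.normal(size=(N, d))

# As N grows, the empirical distribution (1/N) * sum_i delta_{theta_i} of the
# particles is expected to track the mean-field PDE / Wasserstein gradient
# flow described in the abstract.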

Related research

- Conservative SPDEs as fluctuating mean field limits of stochastic gradient descent (07/12/2022)
- Mean-field neural networks: learning mappings on Wasserstein space (10/27/2022)
- Global convergence of neuron birth-death dynamics (02/05/2019)
- Stochastic Mirror Descent in Average Ensemble Models (10/27/2022)
- A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics (02/05/2020)
- Mean-field Analysis of Generalization Errors (06/20/2023)
- Normalization effects on shallow neural networks and related asymptotic expansions (11/20/2020)
