DeepAI AI Chat
Log In Sign Up

Quantitative Propagation of Chaos for SGD in Wide Neural Networks

07/13/2020
by   Valentin De Bortoli, et al.
0

In this paper, we investigate the limiting behavior of a continuous-time counterpart of the Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural networks, as the number or neurons (ie, the size of the hidden layer) N → +∞. Following a probabilistic approach, we show 'propagation of chaos' for the particle system defined by this continuous-time dynamics under different scenarios, indicating that the statistical interaction between the particles asymptotically vanishes. In particular, we establish quantitative convergence with respect to N of any particle to a solution of a mean-field McKean-Vlasov equation in the metric space endowed with the Wasserstein distance. In comparison to previous works on the subject, we consider settings in which the sequence of stepsizes in SGD can potentially depend on the number of neurons and the iterations. We then identify two regimes under which different mean-field limits are obtained, one of them corresponding to an implicitly regularized version of the minimization problem at hand. We perform various experiments on real datasets to validate our theoretical results, assessing the existence of these two regimes on classification problems and illustrating our convergence results.

READ FULL TEXT
05/19/2022

Mean-Field Analysis of Two-Layer Neural Networks: Global Optimality with Linear Convergence Rates

We consider optimizing two-layer neural networks in the mean-field regim...
06/10/2020

Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

We analyze in a closed form the learning dynamics of stochastic gradient...
01/25/2022

Convex Analysis of the Mean Field Langevin Dynamics

As an example of the nonlinear Fokker-Planck equation, the mean field La...
11/03/2021

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks

Understanding the properties of neural networks trained via stochastic g...
10/13/2022

Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence

The stochastic heavy ball method (SHB), also known as stochastic gradien...
10/27/2022

Stochastic Mirror Descent in Average Ensemble Models

The stochastic mirror descent (SMD) algorithm is a general class of trai...
01/24/2023

Sequential propagation of chaos

A new class of particle systems with sequential interaction is proposed ...