Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

06/10/2020
by   Francesca Mignacco, et al.

We analyze in closed form the learning dynamics of stochastic gradient descent (SGD) for a single-layer neural network classifying a high-dimensional Gaussian mixture in which each cluster is assigned one of two labels. This problem provides a prototype of a non-convex loss landscape with interpolating regimes and a large generalization gap. We define a particular stochastic process for which SGD can be extended to a continuous-time limit that we call stochastic gradient flow. In the full-batch limit we recover the standard gradient flow. We apply dynamical mean-field theory from statistical physics to track the dynamics of the algorithm in the high-dimensional limit via a self-consistent stochastic process. We explore the performance of the algorithm as a function of the control parameters, shedding light on how it navigates the loss landscape.
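The basic setup the abstract refers to can be sketched numerically. Below is a minimal, illustrative example (not the authors' exact model or parameters): a two-cluster Gaussian mixture with opposite labels in d dimensions, classified by a single-layer network (a linear readout) trained with mini-batch SGD on the logistic loss. The scalings of the cluster mean and the hyperparameter values are assumptions chosen for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch: two-cluster Gaussian mixture, single-layer classifier.
d, n = 200, 2000                              # input dimension, number of samples
mu = rng.standard_normal(d) / np.sqrt(d)      # cluster mean direction, ||mu|| ~ 1
y = rng.choice([-1.0, 1.0], size=n)           # label = which cluster the point came from
X = y[:, None] * mu[None, :] + rng.standard_normal((n, d)) / np.sqrt(d)

w = np.zeros(d)                               # single-layer weights
lr, batch, epochs = 0.5, 64, 50               # illustrative hyperparameters

def logistic_loss(w):
    """Average logistic loss over the full dataset."""
    z = y * (X @ w)
    return np.mean(np.log1p(np.exp(-z)))

for _ in range(epochs):
    idx = rng.permutation(n)
    for start in range(0, n, batch):
        b = idx[start:start + batch]
        z = y[b] * (X[b] @ w)
        # Gradient of the logistic loss on the mini-batch
        grad = -(X[b].T @ (y[b] / (1.0 + np.exp(z)))) / len(b)
        w -= lr * grad

acc = np.mean(np.sign(X @ w) == y)            # training accuracy of the learned readout
```

Shrinking the batch size toward 1 makes the trajectory noisier, while batch = n recovers full-batch gradient descent, the discrete-time analogue of the gradient-flow limit mentioned in the abstract.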


Related research

- 09/09/2023: Stochastic Gradient Descent outperforms Gradient Descent in recovering a high-dimensional signal in a glassy energy landscape. "Stochastic Gradient Descent (SGD) is an out-of-equilibrium algorithm use..."
- 10/12/2022: Rigorous dynamical mean field theory for stochastic gradient descent methods. "We prove closed-form equations for the exact high-dimensional asymptotic..."
- 03/08/2021: Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem. "In this paper we investigate how gradient-based algorithms such as gradi..."
- 02/01/2022: Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks. "Despite the non-convex optimization landscape, over-parametrized shallow..."
- 03/23/2020: A classification for the performance of online SGD for high-dimensional inference. "Stochastic gradient descent (SGD) is a popular algorithm for optimizatio..."
- 02/12/2023: From high-dimensional mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks. "This manuscript investigates the one-pass stochastic gradient descent (S..."
- 06/08/2022: High-dimensional limit theorems for SGD: Effective dynamics and critical scaling. "We study the scaling limits of stochastic gradient descent (SGD) with co..."
