A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics

02/05/2020
by Belinda Tzen, et al.

We consider the problem of universal approximation of functions by two-layer neural nets with random weights that are "nearly Gaussian" in the sense of Kullback-Leibler divergence. This problem is motivated by recent works on lazy training, where the weight updates generated by stochastic gradient descent do not move appreciably from the i.i.d. Gaussian initialization. We first consider the mean-field limit, where the finite population of neurons in the hidden layer is replaced by a continuous ensemble, and show that our problem can be phrased as global minimization of a free-energy functional on the space of probability measures over the weights. This functional trades off the L^2 approximation risk against the KL divergence with respect to a centered Gaussian prior. We characterize the unique global minimizer and then construct a controlled nonlinear dynamics in the space of probability measures over weights that solves a McKean–Vlasov optimal control problem. This control problem is closely related to the Schrödinger bridge (or entropic optimal transport) problem, and its value is proportional to the minimum of the free energy. Finally, we show that SGD in the lazy training regime (which can be ensured by jointly tuning the variance of the Gaussian prior and the entropic regularization parameter) serves as a greedy approximation to the optimal McKean–Vlasov distributional dynamics and provide quantitative guarantees on the L^2 approximation error.
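As a reading aid, here is a schematic rendering of the free-energy functional described above; the notation (φ for the neuron activation, R for the risk term, β for the entropic regularization parameter) is shorthand for this summary, not necessarily the paper's own.

```latex
% Free energy over probability measures \mu on the weight space:
% L^2 approximation risk plus entropic (KL) regularization.
F(\mu) \;=\; \underbrace{\tfrac{1}{2}\,\mathbb{E}_X\!\big[\big(f(X) - \hat f_\mu(X)\big)^2\big]}_{=\,R(\mu)}
\;+\; \beta^{-1}\, D_{\mathrm{KL}}(\mu \,\|\, \pi),
\qquad
\hat f_\mu(x) \;=\; \int \varphi(x; w)\,\mu(\mathrm{d}w),
```

where π is the centered Gaussian prior. The first-order optimality condition then identifies the unique global minimizer as a Gibbs-type tilt of the prior:

```latex
\frac{\mathrm{d}\mu^{*}}{\mathrm{d}\pi}(w)
\;\propto\; \exp\!\Big(-\beta\,\frac{\delta R}{\delta \mu}(\mu^{*}, w)\Big).
```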
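For intuition about the last claim (noisy SGD as a greedy approximation to the optimal distributional dynamics), the sketch below runs a particle discretization of the corresponding mean-field Langevin dynamics on a toy wide two-layer tanh network. The entropic regularization shows up as a drift toward the Gaussian prior plus injected noise. All names, constants, and the target function here are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): fit a 1-d target with a
# wide two-layer tanh network via noisy SGD. The Gaussian prior of
# variance sigma2 enters as a drift toward 0, and the injected noise of
# size sqrt(2*dt/beta) implements the entropic (KL) regularization at
# the particle level; beta -> infinity recovers plain SGD.

rng = np.random.default_rng(0)
N = 1000           # hidden neurons ("particles")
sigma2 = 1.0       # variance of the Gaussian prior / initialization
beta = 1e3         # inverse temperature: strength of entropic term
dt = 1e-2          # step size
target = lambda x: np.sin(2 * x)

# i.i.d. Gaussian initialization, as in the lazy-training setting
a = rng.normal(0.0, np.sqrt(sigma2), N)   # output weights
b = rng.normal(0.0, np.sqrt(sigma2), N)   # input weights

for step in range(5000):
    x = rng.uniform(-2, 2)                # one-sample SGD
    h = np.tanh(b * x)
    resid = target(x) - a @ h / N         # mean-field output (1/N scaling)
    # gradient of the squared loss w.r.t. each particle, scaled by N so
    # the per-particle drift stays O(1) in the mean-field limit
    grad_a = -resid * h
    grad_b = -resid * a * (1 - h**2) * x
    # Langevin update: loss gradient + prior drift + entropic noise
    a += -dt * (grad_a + a / (beta * sigma2)) \
         + np.sqrt(2 * dt / beta) * rng.normal(size=N)
    b += -dt * (grad_b + b / (beta * sigma2)) \
         + np.sqrt(2 * dt / beta) * rng.normal(size=N)

xs = np.linspace(-2, 2, 5)
preds = np.tanh(np.outer(xs, b)) @ a / N
print(np.c_[xs, target(xs), preds])       # columns: x, target, prediction
```

Jointly shrinking the noise and strengthening the prior drift (large beta, small sigma2) keeps the particles near their Gaussian initialization, which is the lazy-training regime the abstract refers to.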
