The promises and pitfalls of Stochastic Gradient Langevin Dynamics

11/25/2018
by Nicolas Brosse et al.

Stochastic Gradient Langevin Dynamics (SGLD) has emerged as a key MCMC algorithm for Bayesian learning from large-scale datasets. While SGLD with decreasing step sizes converges weakly to the posterior distribution, the algorithm is often used with a constant step size in practice and has demonstrated success in machine learning tasks. The current practice is to set the step size inversely proportional to N, where N is the number of training samples. As N becomes large, we show that the SGLD algorithm has an invariant probability measure which significantly departs from the target posterior and behaves like Stochastic Gradient Descent (SGD). This difference is inherently due to the high variance of the stochastic gradients. Several strategies have been suggested to reduce this effect; among them, SGLD Fixed Point (SGLDFP) uses carefully designed control variates to reduce the variance of the stochastic gradients. We show that SGLDFP gives approximate samples from the posterior distribution, with an accuracy comparable to that of the Langevin Monte Carlo (LMC) algorithm, for a computational cost sublinear in the number of data points. We provide a detailed analysis of the Wasserstein distances between LMC, SGLD, SGLDFP, and SGD, together with explicit expressions for the means and covariance matrices of their invariant distributions. Our findings are supported by limited numerical experiments.
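To make the contrast concrete, the following is a minimal NumPy sketch of the SGLD update and its control-variate variant (SGLDFP) on a toy Bayesian linear regression posterior. The model (Gaussian prior and likelihood), the variable names, and the choice of the ridge/MAP solution as the fixed point theta_star are illustrative assumptions, not the paper's experimental setup or code.

```python
import numpy as np

# Toy posterior (assumed for illustration): prior theta ~ N(0, I),
# likelihood y_i ~ N(x_i^T theta, 1).
rng = np.random.default_rng(0)
N, d, n = 10_000, 5, 32                 # data points, dimension, minibatch size
X = rng.normal(size=(N, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + rng.normal(size=N)

def grad_prior(theta):
    # gradient of log N(0, I)
    return -theta

def grad_lik(theta, idx):
    # per-example gradients of the log likelihood, shape (len(idx), d)
    r = y[idx] - X[idx] @ theta
    return r[:, None] * X[idx]

gamma = 1.0 / N                         # common practice: step size ~ 1/N

# Fixed point for the control variates: here the MAP estimate, obtained
# in closed form for this conjugate toy model (an assumption; the paper
# centers the gradients at a fixed point of the dynamics).
theta_star = np.linalg.solve(X.T @ X + np.eye(d), X.T @ y)
full_grad_star = grad_prior(theta_star) + grad_lik(theta_star, np.arange(N)).sum(axis=0)

def sgld_step(theta):
    # plain SGLD: unbiased minibatch gradient + injected Gaussian noise
    idx = rng.choice(N, size=n, replace=False)
    g = grad_prior(theta) + (N / n) * grad_lik(theta, idx).sum(axis=0)
    return theta + gamma * g + np.sqrt(2 * gamma) * rng.normal(size=d)

def sgldfp_step(theta):
    # same update, but the minibatch gradient is recentered at theta_star;
    # the estimator stays unbiased while its variance shrinks near theta_star
    idx = rng.choice(N, size=n, replace=False)
    g = (full_grad_star
         + grad_prior(theta) - grad_prior(theta_star)
         + (N / n) * (grad_lik(theta, idx) - grad_lik(theta_star, idx)).sum(axis=0))
    return theta + gamma * g + np.sqrt(2 * gamma) * rng.normal(size=d)

theta_a = theta_b = np.zeros(d)
for _ in range(5_000):
    theta_a, theta_b = sgld_step(theta_a), sgldfp_step(theta_b)
```

The sketch also illustrates the variance issue the abstract points to: with gamma of order 1/N, the injected Langevin noise per step is O(N^{-1/2}), while the fluctuation of the plain minibatch gradient term is O(n^{-1/2}) and thus dominates for large N, which is why SGLD drifts toward SGD-like behavior; recentering at theta_star suppresses that fluctuation.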


Related research

05/13/2016 · Barzilai-Borwein Step Size for Stochastic Gradient Descent
One of the major issues in stochastic gradient descent (SGD) methods is ...

06/16/2017 · Control Variates for Stochastic Gradient MCMC
It is well known that Markov chain Monte Carlo (MCMC) methods scale poor...

01/02/2015 · (Non-) asymptotic properties of Stochastic Gradient Langevin Dynamics
Applying standard Markov chain Monte Carlo (MCMC) algorithms to large da...

02/29/2020 · AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient meth...

12/18/2022 · Pigeonhole Stochastic Gradient Langevin Dynamics for Large Crossed Mixed Effects Models
Large crossed mixed effects models with imbalanced structures and missin...

06/29/2020 · Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications
In this work, we propose a Bayesian type sparse deep learning algorithm....

06/06/2023 · Machine learning in and out of equilibrium
The algorithms used to train neural networks, like stochastic gradient d...
