Bridging the Gap between Constant Step Size Stochastic Gradient Descent and Markov Chains

07/20/2017
by Aymeric Dieuleveut, et al.

We consider the minimization of an objective function, given access to unbiased estimates of its gradient, through stochastic gradient descent (SGD) with constant step-size. While the detailed analysis was previously performed only for quadratic functions, we provide an explicit asymptotic expansion of the moments of the averaged SGD iterates that outlines the dependence on initial conditions, the effect of noise and the step-size, as well as the lack of convergence in the general (non-quadratic) case. For this analysis, we bring tools from Markov chain theory into the analysis of stochastic gradient methods and create new ones (similar to, but different from, stochastic MCMC methods). We then show that Richardson-Romberg extrapolation may be used to get closer to the global optimum, and we show empirical improvements of the new extrapolation scheme.
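The extrapolation idea in the abstract can be illustrated with a minimal sketch. It assumes a toy one-dimensional, non-quadratic objective f(θ) = exp(θ) − θ (minimized at θ = 0) with additive Gaussian gradient noise; the objective, the function names `averaged_sgd` and `richardson_romberg`, and all parameter values are illustrative choices, not taken from the paper. The sketch runs averaged constant step-size SGD at step sizes γ and 2γ and combines the two averages as 2·avg(γ) − avg(2γ), which cancels the part of the bias that is linear in the step size.

```python
import math
import random


def averaged_sgd(step_size, n_steps, theta0=0.5, noise_std=1.0, seed=0):
    """Run constant step-size SGD and return the average of the iterates.

    Toy objective (an assumption for illustration): f(theta) = exp(theta) - theta,
    whose exact gradient is exp(theta) - 1 and whose minimizer is theta = 0.
    """
    rng = random.Random(seed)
    theta = theta0
    running_sum = 0.0
    for _ in range(n_steps):
        # Unbiased gradient estimate: exact gradient plus zero-mean noise.
        grad_estimate = math.exp(theta) - 1.0 + noise_std * rng.gauss(0.0, 1.0)
        theta -= step_size * grad_estimate
        running_sum += theta
    return running_sum / n_steps


def richardson_romberg(step_size, n_steps):
    """Combine averaged SGD runs at step sizes gamma and 2*gamma.

    Returns (extrapolated, avg_gamma, avg_2gamma), where
    extrapolated = 2 * avg_gamma - avg_2gamma.
    """
    avg_gamma = averaged_sgd(step_size, n_steps, seed=1)
    avg_2gamma = averaged_sgd(2.0 * step_size, n_steps, seed=2)
    return 2.0 * avg_gamma - avg_2gamma, avg_gamma, avg_2gamma


if __name__ == "__main__":
    extrapolated, avg_gamma, avg_2gamma = richardson_romberg(0.1, 200_000)
    print("averaged SGD, step gamma  :", avg_gamma)
    print("averaged SGD, step 2*gamma:", avg_2gamma)
    print("extrapolated estimate     :", extrapolated)
```

On this toy problem, each plain average settles at a point offset from the minimizer by an amount roughly proportional to the step size, which is the lack of convergence in the non-quadratic case. The extrapolated estimate should land noticeably closer to 0, up to higher-order bias and sampling noise.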


research
10/18/2019

Error Lower Bounds of Constant Step-size Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) plays a central role in modern machine...
research
02/18/2023

Parameter Averaging for SGD Stabilizes the Implicit Bias towards Flat Regions

Stochastic gradient descent is a workhorse for training deep neural netw...
research
06/20/2023

Convergence and concentration properties of constant step-size SGD through Markov chains

We consider the optimization of a smooth and strongly convex objective u...
research
11/29/2014

Constant Step Size Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions

We consider the least-squares regression problem and provide a detailed ...
research
06/17/2021

Sub-linear convergence of a tamed stochastic gradient descent method in Hilbert space

In this paper, we introduce the tamed stochastic gradient descent method...
research
05/18/2020

Convergence of constant step stochastic gradient descent for non-smooth non-convex functions

This paper studies the asymptotic behavior of the constant step Stochast...
research
07/01/2020

On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent

Constant step-size Stochastic Gradient Descent exhibits two phases: a tr...
