Continuous and Discrete-Time Analysis of Stochastic Gradient Descent for Convex and Non-Convex Functions

04/08/2020
by Xavier Fontaine, et al.

This paper proposes a thorough theoretical analysis of Stochastic Gradient Descent (SGD) with decreasing step sizes. First, we show that the recursion defining SGD can be provably approximated by solutions of a time-inhomogeneous Stochastic Differential Equation (SDE) in both a weak and a strong sense. Then, motivated by recent analyses of deterministic and stochastic optimization methods through their continuous counterparts, we study the long-time convergence of the continuous processes at hand and establish non-asymptotic bounds. For this purpose, we develop new comparison techniques, which we believe are of independent interest. This continuous analysis allows us to build intuition about the convergence of SGD and, adapting the techniques to the discrete setting, we show that the same results hold for the corresponding sequences. Notably, our analysis yields non-asymptotic bounds for SGD in the convex setting under weaker assumptions than those considered in previous works. Finally, we also establish finite-time convergence results under various conditions, including relaxations of the well-known Łojasiewicz inequality, which apply to a class of non-convex functions.
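To make the setting concrete, here is a minimal sketch of the objects involved; the notation below is assumed for illustration and the paper's exact noise model and step-size schedule may differ. SGD with a decreasing step-size schedule and a plausible time-inhomogeneous SDE counterpart can be written as

\[
x_{n+1} = x_n - \gamma_{n+1}\bigl(\nabla f(x_n) + \varepsilon_{n+1}\bigr), \qquad \gamma_n = \gamma_0\, n^{-\alpha},
\]
\[
\mathrm{d}X_t = -\gamma(t)\,\nabla f(X_t)\,\mathrm{d}t + \gamma(t)\,\sigma(X_t)\,\mathrm{d}B_t,
\]

where \(\varepsilon_{n+1}\) is zero-mean gradient noise, \(B_t\) is a standard Brownian motion, and the time-dependent coefficient \(\gamma(t)\) is what makes the SDE time-inhomogeneous. As one standard example of the non-convex conditions alluded to, a Łojasiewicz-type gradient inequality takes the form \(\|\nabla f(x)\|^2 \ge c\,\bigl(f(x) - f^\ast\bigr)^{2\theta}\) for some \(\theta \in [1/2, 1)\), with \(\theta = 1/2\) recovering the Polyak–Łojasiewicz condition; the relaxations studied in the paper generalize inequalities of this kind.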


Related research

12/14/2021
Non Asymptotic Bounds for Optimization via Online Multiplicative Stochastic Gradient Descent
The gradient noise of Stochastic Gradient Descent (SGD) is considered to...

07/07/2019
Quantitative W_1 Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential State-Dependent Noise
We prove quantitative convergence rates at which discrete Langevin-like ...

07/11/2022
On uniform-in-time diffusion approximation for stochastic gradient descent
The diffusion approximation of stochastic gradient descent (SGD) in curr...

02/13/2017
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
Stochastic Gradient Langevin Dynamics (SGLD) is a popular variant of Sto...

10/09/2018
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
We study Stochastic Gradient Descent (SGD) with diminishing step sizes f...

04/01/2020
Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions
While Stochastic Gradient Descent (SGD) is a rather efficient algorithm ...

08/04/2021
Stochastic Subgradient Descent Escapes Active Strict Saddles
In non-smooth stochastic optimization, we establish the non-convergence ...
