Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms via Diffusion Approximation

02/02/2019
by Yuanyuan Feng, et al.

Diffusion approximation provides a weak approximation of stochastic gradient descent (SGD) algorithms over a finite time horizon. In this paper, we introduce new tools, motivated by the backward error analysis of numerical stochastic differential equations, into the theoretical framework of diffusion approximation, extending the validity of the weak approximation from a finite to an infinite time horizon. The new techniques developed in this paper enable us to characterize the asymptotic behavior of constant-step-size SGD algorithms for strongly convex objective functions, a goal previously unreachable within the diffusion approximation framework. Our analysis builds upon a truncated formal power expansion of the solution of a stochastic modified equation arising from diffusion approximation, where the main technical ingredient is a uniform-in-time weak error bound controlling the long-term behavior of the expansion coefficient functions near the global minimum. We expect these new techniques to greatly expand the range of applicability of diffusion approximation to cover wider and deeper aspects of stochastic optimization algorithms in data science.
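To make the notion of a uniform-in-time weak error concrete, the following is a minimal sketch on a toy problem that is not taken from the paper. It assumes a one-dimensional strongly convex quadratic objective f(x) = (a/2) x^2, additive Gaussian gradient noise with standard deviation sigma, and the test function g(x) = x^2. Under these assumptions both E[x_k^2] for constant-step-size SGD and E[X_t^2] for the first-order diffusion approximation dX = -a X dt + sqrt(eta) sigma dW can be computed exactly, so no sampling is needed. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def sde_second_moment(x0, a, sigma, eta, t):
    """Exact E[X_t^2] for the Ornstein-Uhlenbeck process
    dX = -a X dt + sqrt(eta) * sigma dW with X_0 = x0
    (assumed first-order diffusion approximation of the toy SGD)."""
    decay = math.exp(-2.0 * a * t)
    return x0 * x0 * decay + eta * sigma ** 2 / (2.0 * a) * (1.0 - decay)


def max_weak_error(eta, x0=1.0, a=1.0, sigma=1.0, horizon=50.0):
    """Largest |E[x_k^2] - E[X_{k*eta}^2]| over all steps with k*eta <= horizon.

    The SGD recursion is x_{k+1} = x_k - eta * (a * x_k + sigma * xi_k),
    with xi_k i.i.d. standard normal, so the second moment obeys
    m_{k+1} = (1 - eta*a)^2 * m_k + (eta*sigma)^2 exactly.
    """
    steps = int(round(horizon / eta))
    m = x0 * x0  # E[x_0^2] for a deterministic start
    worst = 0.0
    for k in range(1, steps + 1):
        m = (1.0 - eta * a) ** 2 * m + (eta * sigma) ** 2
        gap = abs(m - sde_second_moment(x0, a, sigma, eta, k * eta))
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    for eta in (0.1, 0.05, 0.025):
        short = max_weak_error(eta, horizon=50.0)
        long = max_weak_error(eta, horizon=500.0)
        print(f"eta={eta}: max error on [0,50] = {short:.5f}, on [0,500] = {long:.5f}")
```

In this toy setting the maximal error does not grow when the horizon is lengthened, and it shrinks roughly in proportion to the step size eta, which is the qualitative behavior a uniform-in-time weak error bound describes. The paper's general analysis covers strongly convex objectives beyond this quadratic special case.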

