Understanding the unstable convergence of gradient descent

04/03/2022
by Kwangjun Ahn, et al.

Most existing analyses of (stochastic) gradient descent rely on the condition that, for an L-smooth cost, the step size is less than 2/L. However, many works have observed that in machine learning applications step sizes often do not satisfy this condition, yet (stochastic) gradient descent still converges, albeit in an unstable manner. We investigate this unstable convergence phenomenon from first principles and elucidate its key causes. We also identify its main characteristics and how they interrelate, offering a transparent view backed by both theory and experiments.
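
As a minimal illustration of the classical step-size threshold mentioned above (not of the paper's analysis), the sketch below runs gradient descent on the quadratic f(x) = (L/2) x^2, which is L-smooth; the function name gd_quadratic and all parameter values are illustrative choices. On this simple cost, step sizes below 2/L contract toward the minimizer and step sizes above 2/L make the iterates grow, whereas the paper studies losses on which gradient descent can exceed the threshold and still converge, albeit unstably.

import numpy as np

def gd_quadratic(L, eta, x0=1.0, steps=50):
    # Gradient descent on f(x) = (L/2) * x**2, whose gradient is L * x.
    # The update is x <- (1 - eta * L) * x, a contraction iff |1 - eta * L| < 1,
    # i.e. iff 0 < eta < 2/L (the classical stability condition).
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eta * L * xs[-1])
    return np.array(xs)

L = 1.0
for eta in (1.5 / L, 1.99 / L, 2.01 / L):
    final = abs(gd_quadratic(L, eta)[-1])
    print(f"eta = {eta:.2f} (2/L = {2.0 / L:.2f}): |x_50| = {final:.3e}")

Running this prints a final iterate that has shrunk toward zero for the two step sizes below 2/L and grown for the one above it, matching the classical analysis on this quadratic.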

