Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent

11/28/2017
by Chi Jin et al.

Nesterov's accelerated gradient descent (AGD), an instance of the general family of "momentum methods", provably achieves a faster convergence rate than gradient descent (GD) in the convex setting. However, whether these methods are superior to GD in the nonconvex setting remains open. This paper studies a simple variant of AGD, and shows that it escapes saddle points and finds a second-order stationary point in Õ(1/ϵ^{7/4}) iterations, faster than the Õ(1/ϵ^2) iterations required by GD. To the best of our knowledge, this is the first Hessian-free algorithm to find a second-order stationary point faster than GD, and also the first single-loop algorithm with a faster rate than GD even in the setting of finding a first-order stationary point. Our analysis is based on two key ideas: (1) the use of a simple Hamiltonian function, inspired by a continuous-time perspective, which AGD monotonically decreases per step even for nonconvex functions, and (2) a novel framework called "improve or localize", which is useful for tracking the long-term behavior of gradient-based optimization algorithms. We believe that these techniques may deepen our understanding of both acceleration algorithms and nonconvex optimization.
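
To make the quantities concrete, below is a minimal Python/NumPy sketch of a Nesterov-style AGD loop that tracks a Hamiltonian of the potential-plus-kinetic form f(x_t) + ||x_t − x_{t−1}||^2 / (2η), which is the kind of energy function the abstract refers to. This is an illustration only, not the authors' algorithm: the function name agd_with_hamiltonian, the step size eta, the momentum parameter theta, and the test function are illustrative choices, and the paper's full method additionally involves perturbations and a negative-curvature-exploitation step that are omitted here.

```python
import numpy as np

def agd_with_hamiltonian(f, grad_f, x0, eta=1e-3, theta=0.9, n_iters=1000):
    """Basic Nesterov-style AGD loop that records a Hamiltonian at each iterate.

    Illustrative sketch only; parameters and structure are not taken from the paper.
    """
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    energies = []
    for _ in range(n_iters):
        v = x - x_prev                    # momentum term v_t = x_t - x_{t-1}
        y = x + theta * v                 # look-ahead (extrapolated) point
        x_next = y - eta * grad_f(y)      # gradient step from the look-ahead point
        # Hamiltonian: potential energy f(x_t) plus kinetic energy ||v_t||^2 / (2*eta)
        energies.append(f(x) + float(np.dot(v, v)) / (2.0 * eta))
        x_prev, x = x, x_next
    return x, energies

if __name__ == "__main__":
    # Simple nonconvex test function f(x, y) = (x^2 - 1)^2 + y^2,
    # which has a strict saddle point at the origin and minima at (+/-1, 0).
    f = lambda z: (z[0] ** 2 - 1.0) ** 2 + z[1] ** 2
    grad_f = lambda z: np.array([4.0 * z[0] * (z[0] ** 2 - 1.0), 2.0 * z[1]])
    x_final, energies = agd_with_hamiltonian(f, grad_f, x0=[0.1, 0.5])
    print("final point:", x_final)
    print("Hamiltonian nonincreasing along the run:",
          all(e2 <= e1 + 1e-12 for e1, e2 in zip(energies, energies[1:])))
```

Monitoring this energy, rather than f alone, is what allows a momentum method's progress to be certified step by step even when f itself does not decrease monotonically.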
