Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization

by Jun-Kun Wang, et al.

The Heavy Ball method, proposed by Polyak over five decades ago, is a first-order method for optimizing continuous functions. While its stochastic counterpart has proven extremely popular for training deep networks, there are almost no known non-convex functions on which deterministic Heavy Ball is provably faster than classical gradient descent. The success of Heavy Ball has thus far eluded theoretical understanding. Our goal is to address this gap: in the present work we identify two non-convex problems for which we prove that Heavy Ball momentum helps the iterate enter a benign region containing a global optimal point faster. We show that Heavy Ball exhibits simple dynamics that clearly reveal the benefit of using a larger momentum parameter for these problems. The first problem is phase retrieval, which has useful applications in the physical sciences. The second is cubic-regularized minimization, a critical subroutine required by the Nesterov-Polyak cubic-regularized method to find second-order stationary points in general smooth non-convex problems.
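For readers who want to see the update rule concretely, here is a minimal sketch of the deterministic Heavy Ball iteration, w_{t+1} = w_t - eta * grad f(w_t) + beta * (w_t - w_{t-1}), applied to a phase retrieval objective. The abstract does not state the objective, so the quartic loss f(w) = 1/(4n) * sum_i ((x_i . w)^2 - y_i)^2 used here is an assumption (a common formulation), and the names `phase_retrieval_grad`, `heavy_ball`, `eta`, and `beta` are illustrative rather than taken from the paper.

```python
def phase_retrieval_grad(w, X, y):
    """Gradient of the assumed loss f(w) = 1/(4n) * sum_i ((x_i . w)^2 - y_i)^2.

    X is a list of measurement vectors x_i, y the list of observations
    y_i = (x_i . w_star)^2 for some unknown signal w_star.
    """
    n, d = len(X), len(w)
    g = [0.0] * d
    for x_i, y_i in zip(X, y):
        p = sum(a * b for a, b in zip(x_i, w))   # inner product x_i . w
        c = (p * p - y_i) * p / n                # scalar weight of x_i in the gradient
        for j in range(d):
            g[j] += c * x_i[j]
    return g


def heavy_ball(grad, w0, eta, beta, steps):
    """Run `steps` Heavy Ball iterations from w0 (with w_{-1} = w_0).

    Setting beta = 0 recovers plain gradient descent.
    """
    w_prev, w = list(w0), list(w0)
    for _ in range(steps):
        g = grad(w)
        w_next = [wi - eta * gi + beta * (wi - wpi)
                  for wi, gi, wpi in zip(w, g, w_prev)]
        w_prev, w = w, w_next
    return w
```

To compare the two methods on the same data, one could call `heavy_ball(lambda w: phase_retrieval_grad(w, X, y), w0, eta, beta, steps)` once with `beta = 0` and once with a larger `beta`, keeping `eta` and `w0` fixed; the step size and momentum values are left to the reader, since the abstract does not specify them.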




Stochastic Heavy Ball

This paper deals with a natural stochastic optimization procedure derive...

NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization

Accelerated gradient (AG) methods are breakthroughs in convex optimizati...

Just a Momentum: Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problem

When optimizing over loss functions it is common practice to use momentu...

Accelerated Gossip via Stochastic Heavy Ball Method

In this paper we show how the stochastic heavy ball method (SHB) -- a po...

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

Heavy Ball (HB) nowadays is one of the most popular momentum methods in ...

A Momentum Accelerated Adaptive Cubic Regularization Method for Nonconvex Optimization

The cubic regularization method (CR) and its adaptive version (ARC) are ...

AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation

Recent work by Xia et al. leveraged the continuous-limit of the classica...
