From the Ravine method to the Nesterov method and vice versa: a dynamical system perspective

01/27/2022

∙

We revisit the Ravine method of Gelfand and Tsetlin from a dynamical system perspective, study its convergence properties, and highlight its similarities and differences with the Nesterov accelerated gradient method. The two methods are closely related. They can be deduced from each other by reversing the order of the extrapolation and gradient operations in their definitions. They benefit from similar fast convergence of values and convergence of iterates for general convex objective functions. We will also establish the high resolution ODE of the Ravine and Nesterov methods, and reveal an additional geometric damping term driven by the Hessian for both methods. This will allow us to prove fast convergence towards zero of the gradients not only for the Ravine method but also for the Nesterov method for the first time. We also highlight connections to other algorithms stemming from more subtle discretization schemes, and finally describe a Ravine version of the proximal-gradient algorithms for general structured smooth + non-smooth convex optimization problems.

READ FULL TEXT

From the Ravine method to the Nesterov method and vice versa: a dynamical system perspective

Sign in with Google

Consider DeepAI Pro