First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time

11/03/2017
by Yi Xu, et al.

Two classes of methods have been proposed for escaping from saddle points: one uses the second-order information carried by the Hessian, and the other adds noise to the first-order information. The existing analyses of algorithms that add noise to the first-order information are quite involved and hide the essence of the added noise, which hinders further improvement of these algorithms. In this paper, we present a novel perspective on the noise-adding technique, namely that adding noise to the first-order information helps extract the negative curvature from the Hessian matrix, and we provide a formal justification of this perspective by analyzing a simple first-order procedure. More importantly, the proposed procedure enables one to design purely first-order stochastic algorithms for escaping from non-degenerate saddle points with a much better time complexity (almost linear in the problem's dimensionality). In particular, we develop a first-order stochastic algorithm, based on our new technique and an existing algorithm that only converges to a first-order stationary point, that enjoys a time complexity of $O(d/\epsilon^{3.5})$ for finding a nearly second-order stationary point $\mathbf{x}$ such that $\|\nabla F(\mathbf{x})\| \le \epsilon$ and $\nabla^2 F(\mathbf{x}) \succeq -\sqrt{\epsilon}\, I$ (with high probability), where $F(\cdot)$ denotes the objective function and $d$ is the dimensionality of the problem. To the best of our knowledge, this is the best theoretical result for first-order algorithms in stochastic non-convex optimization, and it is competitive with, if not better than, existing stochastic algorithms that hinge on second-order information.
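To illustrate the core idea described in the abstract (not the paper's actual algorithm, whose details are in the full text), the following minimal Python sketch shows how purely first-order information plus a small random perturbation can extract a negative-curvature direction near a saddle point: since grad(x + u) - grad(x) approximates the Hessian-vector product H u, repeatedly applying this difference acts like a power method on (I - eta*H) and amplifies the most negative eigen-direction. The objective, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

def grad(x):
    # Hypothetical test objective F(x) = 0.5 * x^T H x with a non-degenerate
    # saddle at the origin (H has one negative eigenvalue).
    H = np.diag([1.0, -0.5])
    return H @ x

def negative_curvature_from_gradients(x, grad, dim, r=1e-3, eta=0.1, iters=200, rng=None):
    """Sketch of the noise-adding idea using only gradient evaluations.

    Start from a small random perturbation u of x; because
    grad(x + u) - grad(x) equals H u for a quadratic (and approximates it in
    general for small u), the update u <- u - eta * H u behaves like a power
    method on (I - eta*H), so u aligns with the most negative eigen-direction
    of the Hessian when one exists.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = r * rng.standard_normal(dim)      # injected noise supplies a component along the escape direction
    for _ in range(iters):
        hvp = grad(x + u) - grad(x)       # first-order approximation of the Hessian-vector product H u
        u = u - eta * hvp                 # power-method-like step using only gradients
        u = r * u / np.linalg.norm(u)     # keep the perturbation small so the local quadratic model stays valid
    return u / np.linalg.norm(u)

x_saddle = np.zeros(2)
v = negative_curvature_from_gradients(x_saddle, grad, dim=2)
print(v)  # approximately +/- [0, 1], the direction of negative curvature
```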
