A Caputo fractional derivative-based algorithm for optimization

by Yeonjong Shin, et al.

We propose a novel Caputo fractional derivative-based optimization algorithm. Upon defining the Caputo fractional gradient with respect to the Cartesian coordinate, we present a generic Caputo fractional gradient descent (CFGD) method. We prove that the CFGD yields the steepest descent direction of a locally smoothed objective function. The generic CFGD requires three parameters to be specified, and each choice of the parameters yields a version of CFGD. We propose three versions: non-adaptive, adaptive terminal, and adaptive order. Focusing on quadratic objective functions, we provide a convergence analysis. We prove that the non-adaptive CFGD converges to a Tikhonov regularized solution. For the two adaptive versions, we derive error bounds, which show convergence to an integer-order stationary point under some conditions. We derive an explicit formula of CFGD for quadratic functions. We found computationally that the adaptive terminal (AT) CFGD mitigates the dependence of the convergence rate on the condition number and results in significant acceleration over gradient descent (GD). For non-quadratic functions, we develop an efficient implementation of CFGD using the Gauss-Jacobi quadrature, whose computational cost is approximately proportional to the number of quadrature points times the cost of GD. Our numerical examples show that AT-CFGD results in acceleration over GD, even when a small number of Gauss-Jacobi quadrature points (including a single point) is used.
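To illustrate why a Gauss-Jacobi rule suits this setting, the sketch below evaluates the textbook one-dimensional left Caputo derivative of order alpha in (0, 1) with terminal c, namely (1 / Gamma(1 - alpha)) times the integral over [c, x] of f'(t) (x - t)^(-alpha) dt, using a single quadrature point. It is not the paper's CFGD implementation: the function name, its arguments, and the restriction to one dimension and one node are assumptions made for illustration. Mapping [c, x] to [-1, 1] turns the singular kernel into the Jacobi weight (1 - u)^(-alpha), and for that weight the one-point node and weight have the closed forms used in the code.

```python
import math


def caputo_derivative_one_point(df, x, c, alpha):
    """One-point Gauss-Jacobi estimate of the left Caputo derivative.

    df    : ordinary first derivative f' of the objective (callable)
    x     : evaluation point, must satisfy x > c
    c     : lower terminal of the Caputo integral
    alpha : fractional order, 0 < alpha < 1
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not c < x:
        raise ValueError("this sketch assumes c < x")

    # Substituting t = c + (x - c) * (1 + u) / 2 maps [c, x] to [-1, 1]
    # and turns (x - t)^(-alpha) into the Jacobi weight (1 - u)^(-alpha).
    # For that weight the one-point rule is known in closed form:
    node = alpha / (2.0 - alpha)                    # first moment / mass
    weight = 2.0 ** (1.0 - alpha) / (1.0 - alpha)   # total mass of the weight

    t = c + (x - c) * (1.0 + node) / 2.0
    scale = ((x - c) / 2.0) ** (1.0 - alpha) / math.gamma(1.0 - alpha)
    return scale * weight * df(t)


# Usage: f(t) = t^2 with terminal c = 0. A one-point rule integrates
# polynomials of degree <= 1 exactly, and f'(t) = 2t is linear, so the
# estimate should agree with 2 * x^(2 - alpha) / Gamma(3 - alpha).
alpha, x = 0.5, 1.5
estimate = caputo_derivative_one_point(lambda t: 2.0 * t, x, 0.0, alpha)
closed_form = 2.0 * x ** (2.0 - alpha) / math.gamma(3.0 - alpha)
print(estimate, closed_form)
```

The cost is one evaluation of f' per quadrature point, which is in line with the abstract's remark that the cost scales with the number of quadrature points times the cost of GD. For non-polynomial f' a single point is only an approximation, and more nodes would be needed for higher accuracy.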


The Novel Adaptive Fractional Order Gradient Decent Algorithms Design via Robust Control

The vanilla fractional order gradient descent may oscillatively converge...

Local Convergence of Adaptive Gradient Descent Optimizers

Adaptive Moment Estimation (ADAM) is a very popular training algorithm f...

RES: Regularized Stochastic BFGS Algorithm

RES, a regularized stochastic version of the Broyden-Fletcher-Goldfarb-S...

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima

We consider Sharpness-Aware Minimization (SAM), a gradient-based optimiz...

Quadratic speedup of global search using a biased crossover of two good solutions

The minimisation of cost functions is crucial in various optimisation fi...

A Deterministic Approach to Avoid Saddle Points

Loss functions with a large number of saddle points are one of the main ...

Symmetry Teleportation for Accelerated Optimization

Existing gradient-based optimization methods update the parameters local...
