A Control-Theoretic Perspective on Optimal High-Order Optimization
In this paper, we provide a control-theoretic perspective on optimal tensor optimization algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function Φ: R^d → R that is convex and twice continuously differentiable, we study an ordinary differential equation (ODE) that is governed by the gradient operator ∇Φ and a positive control parameter λ(t) that tends to infinity as t → +∞. The tuning of λ(·) is achieved via a closed-loop control law based on the algebraic equation [λ(t)]^p ‖∇Φ(x(t))‖^(p-1) = θ for a given θ > 0. We prove the existence and uniqueness of a local solution to this closed-loop ODE by the Banach fixed-point theorem. We then present a Lyapunov function that allows us to establish the existence and uniqueness of a global solution and to analyze the convergence properties of trajectories. The rate of convergence is O(t^(-(3p+1)/2)) in terms of the objective gap and O(t^(-3p)) in terms of the squared gradient norm. We present two frameworks for implicit time discretization of the ODE, one of which generalizes the large-step A-HPE framework of <cit.>, and the other of which leads to a new p-th order tensor algorithm. A highlight of our analysis is that all of the p-th order optimal tensor algorithms in this paper minimize the squared gradient norm at a rate of O(k^(-3p)).
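As a small illustration of the closed-loop control law only (not of the full ODE, which the abstract does not state), the algebraic equation can be solved for λ in closed form: λ = (θ / ‖∇Φ(x)‖^(p-1))^(1/p). The sketch below assumes that reading of the equation; the function name `control_lambda` and the sample gradient values are hypothetical and chosen for illustration.

```python
import math

def control_lambda(grad, theta, p):
    """Solve lambda^p * ||grad||^(p-1) = theta for lambda > 0.

    grad  : sequence of floats, the gradient of Phi at the current point
    theta : positive constant from the control law
    p     : order of the method (integer >= 1)
    """
    if theta <= 0 or p < 1:
        raise ValueError("need theta > 0 and p >= 1")
    g = math.sqrt(sum(gi * gi for gi in grad))  # Euclidean norm of the gradient
    if p == 1:
        return float(theta)      # the gradient norm drops out when p = 1
    if g == 0.0:
        return math.inf          # at a stationary point, lambda is unbounded
    return (theta / g ** (p - 1)) ** (1.0 / p)

# Hypothetical gradients of decreasing norm, as along a converging trajectory.
for grad in ([3.0, 4.0], [0.3, 0.4], [0.03, 0.04]):
    lam = control_lambda(grad, theta=1.0, p=2)
    print(grad, lam)
```

For p ≥ 2 the computed λ grows as the gradient norm shrinks, which is consistent with the abstract's statement that λ(t) tends to infinity as t → +∞.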