Almost Sure Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions
We prove almost sure convergence rates of Stochastic Zeroth-order Gradient Descent (SZGD) algorithms for Łojasiewicz functions. The SZGD algorithm iterates as $x_{t+1} = x_t - \eta_t \widehat{\nabla} f(x_t)$, $t = 0, 1, 2, 3, \cdots$, where $f$ is the objective function that satisfies the Łojasiewicz inequality with Łojasiewicz exponent $\theta$, $\eta_t$ is the step size (learning rate), and $\widehat{\nabla} f(x_t)$ is the approximate gradient estimated using zeroth-order information. We show that, for smooth Łojasiewicz functions, the sequence $\{x_t\}_{t \in \mathbb{N}}$ governed by SZGD converges to a bounded point $x_\infty$ almost surely, and $x_\infty$ is a critical point of $f$. If $\theta \in (0, \frac{1}{2}]$, then $f(x_t) - f(x_\infty)$, $\sum_{s=t}^{\infty} \|x_s - x_\infty\|^2$, and $\|x_t - x_\infty\|$ (where $\|\cdot\|$ is the Euclidean norm) converge to zero linearly almost surely. If $\theta \in (\frac{1}{2}, 1)$, then $f(x_t) - f(x_\infty)$ (and $\sum_{s=t}^{\infty} \|x_{s+1} - x_s\|^2$) converges to zero at rate $o(t^{\frac{1}{1-2\theta}} \log t)$ almost surely, and $\|x_t - x_\infty\|$ converges to zero at rate $o(t^{\frac{1-\theta}{1-2\theta}} \log t)$ almost surely. To the best of our knowledge, this paper provides the first almost sure convergence rate guarantee for stochastic zeroth-order algorithms for Łojasiewicz functions.
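The SZGD iteration above can be illustrated with a minimal Python sketch. The abstract does not specify how the approximate gradient is built, so the two-point estimator along a single random unit direction used here is an assumption (one common zeroth-order choice), as are the names `zo_gradient`, `szgd`, and `quadratic`, the constant step size, and the smoothing-radius schedule `delta0 / (t + 1)`. The test objective $f(x) = \|x\|^2$ satisfies the Łojasiewicz inequality with $\theta = \frac{1}{2}$ under the common convention $|f(x) - f(x_\infty)|^{\theta} \le C \|\nabla f(x)\|$, which corresponds to the linear-rate regime.

```python
import math
import random


def zo_gradient(f, x, delta, rng):
    """Two-point zeroth-order gradient estimate (assumed estimator, not from the abstract)."""
    n = len(x)
    # Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector.
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(gi * gi for gi in g))
    v = [gi / norm for gi in g]
    x_plus = [xi + delta * vi for xi, vi in zip(x, v)]
    x_minus = [xi - delta * vi for xi, vi in zip(x, v)]
    # Central finite difference of f along v; only function values are used.
    slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)
    # Scale by the dimension n so the estimate matches the gradient on average
    # (exactly so for quadratic f).
    return [n * slope * vi for vi in v]


def szgd(f, x0, eta, delta0, steps, seed=0):
    """Run x_{t+1} = x_t - eta * g_t, where g_t is the zeroth-order estimate at x_t."""
    rng = random.Random(seed)
    x = list(x0)
    history = [f(x)]  # objective value at every iterate, including x_0
    for t in range(steps):
        delta = delta0 / (t + 1)  # hypothetical shrinking smoothing radius
        g = zo_gradient(f, x, delta, rng)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        history.append(f(x))
    return x, history


def quadratic(x):
    """f(x) = ||x||^2, whose unique critical point is the origin."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    x_final, history = szgd(quadratic, [1.0] * 5, eta=0.05, delta0=1e-2, steps=300)
    print("f(x_0) =", history[0])
    print("f(x_T) =", history[-1])
```

The step size `eta=0.05` is chosen so that `eta` times the dimension stays below 1, which keeps each update a contraction for this particular quadratic; other objectives would need their own tuning.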