Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization
We consider optimization problems of the form min_{x ∈ ℝ^d} f(x) ≜ 𝔼_ξ[F(x; ξ)], where the component F(x; ξ) is L-mean-squared Lipschitz but possibly nonconvex and nonsmooth. A recently proposed gradient-free method requires at most 𝒪(L^4 d^{3/2} ϵ^{-4} + Δ L^3 d^{3/2} δ^{-1} ϵ^{-4}) stochastic zeroth-order oracle calls to find a (δ, ϵ)-Goldstein stationary point of the objective function, where Δ = f(x_0) − inf_{x ∈ ℝ^d} f(x) and x_0 is the initial point of the algorithm. This paper proposes a more efficient algorithm using stochastic recursive gradient estimators, which improves the complexity to 𝒪(L^3 d^{3/2} ϵ^{-3} + Δ L^2 d^{3/2} δ^{-1} ϵ^{-3}).
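As a minimal illustration of the zeroth-order oracle model the abstract refers to, the sketch below implements a standard two-point gradient estimator based on uniform sphere sampling, which approximates the gradient of a δ-randomized smoothing of f using only function evaluations. This is a generic building block of gradient-free methods, not the paper's actual algorithm (the paper builds a recursive variance-reduced estimator on top of such queries); the function name, batch size, and parameters here are illustrative assumptions.

```python
import numpy as np

def zo_gradient_estimate(f, x, delta=1e-3, batch=1000, seed=0):
    """Two-point zeroth-order gradient estimator (illustrative sketch).

    Averages (d / (2*delta)) * (f(x + delta*u) - f(x - delta*u)) * u over
    directions u drawn uniformly from the unit sphere; this estimates the
    gradient of the delta-randomized smoothing of f, which is well defined
    even when f itself is nonsmooth.
    """
    rng = np.random.default_rng(seed)
    d = x.size
    g = np.zeros(d)
    for _ in range(batch):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # uniform direction on the unit sphere
        g += (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u
    return g / batch

# Example on a nonsmooth objective f(x) = ||x||_1: away from the kinks,
# the averaged estimate tracks the subgradient sign(x).
x = np.array([1.0, -2.0, 0.5])
g = zo_gradient_estimate(lambda v: np.abs(v).sum(), x)
```

With only function-value queries, each estimate costs two oracle calls per sampled direction; the d-dependence of this variance is what drives the d^{3/2} factors in the complexity bounds above.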