Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods

by Yujia Jin et al.

We design accelerated algorithms with improved rates for several fundamental classes of optimization problems. All of our algorithms build on the analysis of primal-dual extragradient methods via relative Lipschitzness recently proposed in [CST21].

(1) Separable minimax optimization. We study separable minimax problems min_x max_y f(x) - g(y) + h(x, y), where f and g have smoothness and strong convexity parameters (L^x, μ^x) and (L^y, μ^y) respectively, and h is convex-concave with a (Λ^xx, Λ^xy, Λ^yy)-blockwise operator norm bounded Hessian. We provide an algorithm with gradient query complexity Õ(√(L^x/μ^x) + √(L^y/μ^y) + Λ^xx/μ^x + Λ^xy/√(μ^x μ^y) + Λ^yy/μ^y). Notably, for convex-concave minimax problems with bilinear coupling (e.g. quadratics), where Λ^xx = Λ^yy = 0, our rate matches a lower bound of [ZHZ19].

(2) Finite sum optimization. We study finite sum optimization problems min_x (1/n) ∑_{i∈[n]} f_i(x), where each f_i is L_i-smooth and the overall problem is μ-strongly convex. We provide an algorithm with gradient query complexity Õ(n + ∑_{i∈[n]} √(L_i/(nμ))). Notably, when the smoothness bounds {L_i}_{i∈[n]} are non-uniform, our rate improves upon accelerated SVRG [LMH15, FGKS15] and Katyusha [All17] by up to a √n factor.

(3) Minimax finite sums. We generalize our algorithms for minimax and finite sum optimization to solve a natural family of minimax finite sum optimization problems at an accelerated rate, encapsulating both of the above results up to a logarithmic factor.
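To make the primal-dual extragradient setting concrete, the sketch below runs the classical extragradient method of Korpelevich on a toy bilinearly coupled instance min_x max_y (μ/2)||x||² + xᵀAy - (μ/2)||y||², i.e. the Λ^xx = Λ^yy = 0 case with quadratic f and g. This is background only, under assumed toy parameters; it is not the paper's accelerated algorithm and does not attain its improved rate.

```python
import numpy as np

rng = np.random.default_rng(0)
d, mu = 5, 1.0
A = rng.standard_normal((d, d))  # bilinear coupling matrix

def grad_x(x, y):
    # gradient in x of (mu/2)||x||^2 + x'Ay - (mu/2)||y||^2
    return mu * x + A @ y

def grad_y(x, y):
    # gradient in y (ascent direction for the max player)
    return A.T @ x - mu * y

x = rng.standard_normal(d)
y = rng.standard_normal(d)
# step size from a crude Lipschitz bound on the gradient operator
eta = 1.0 / (mu + np.linalg.norm(A, 2))

for _ in range(2000):
    # extrapolation (half) step at the current point
    xh = x - eta * grad_x(x, y)
    yh = y + eta * grad_y(x, y)
    # update step using gradients evaluated at the half point
    x = x - eta * grad_x(xh, yh)
    y = y + eta * grad_y(xh, yh)

# the unique saddle point of this strongly-convex-strongly-concave
# instance is (0, 0), so the iterates should contract toward the origin
```

Because the associated gradient operator is μ-strongly monotone and Lipschitz, extragradient with this step size converges linearly to the saddle point; the paper's contribution is a sharper, accelerated dependence on the blockwise parameters above.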




