Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with Inexact Prox

by Abdurakhmon Sadiev, et al.

Inspired by a recent breakthrough of Mishchenko et al. (2022), who showed for the first time that local gradient steps can lead to provable communication acceleration, we propose an alternative algorithm that obtains the same communication acceleration as their method (ProxSkip). Our approach is very different, however. It is based on the celebrated method of Chambolle and Pock (2011), with several nontrivial modifications: i) we allow for an inexact computation of the prox operator of a certain smooth strongly convex function via a suitable gradient-based method (e.g., GD, Fast GD or FSFOM), and ii) we perform a careful modification of the dual update step in order to retain linear convergence. Our general results offer new state-of-the-art rates for the class of strongly convex-concave saddle-point problems with bilinear coupling characterized by the absence of smoothness in the dual function. When applied to federated learning, we obtain a theoretically better alternative to ProxSkip: our method requires fewer local steps (O(κ^{1/3}) or O(κ^{1/4}), compared to O(κ^{1/2}) for ProxSkip) and performs a deterministic, rather than random, number of local steps. Like ProxSkip, our method can be applied to optimization over a connected network, and we obtain theoretical improvements here as well.
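
To make the idea of an inexact prox concrete, here is a minimal sketch of the basic Chambolle-Pock iteration in which the primal prox step is approximated by a fixed, deterministic number of gradient descent steps. This is not the algorithm proposed in the paper: the dual update is left unmodified, and the toy problem (a strongly convex quadratic f(x) = 0.5 x^T Q x - c^T x minimized subject to Kx = b), the function names, the step sizes, and the iteration counts are all assumptions chosen for illustration.

```python
import numpy as np


def inexact_prox_gd(grad_f, L, v, tau, u0, n_steps):
    """Approximate prox_{tau f}(v) = argmin_u f(u) + ||u - v||^2 / (2 tau)
    with n_steps of plain gradient descent, warm-started at u0."""
    u = u0.copy()
    step = 1.0 / (L + 1.0 / tau)  # 1 / smoothness constant of the subproblem
    for _ in range(n_steps):
        u = u - step * (grad_f(u) + (u - v) / tau)
    return u


def chambolle_pock_inexact(Q, c, K, b, n_outer=2000, n_inner=20):
    """Illustrative Chambolle-Pock loop for min_x max_y f(x) + <y, Kx - b>,
    with the primal prox replaced by n_inner gradient steps (hypothetical helper)."""
    L = np.linalg.eigvalsh(Q).max()   # smoothness constant of f
    op_norm = np.linalg.norm(K, 2)    # spectral norm of K
    tau = sigma = 0.9 / op_norm       # assumed steps with tau * sigma * ||K||^2 < 1

    def grad_f(u):
        return Q @ u - c

    x = np.zeros(Q.shape[0])
    x_bar = x.copy()
    y = np.zeros(K.shape[0])
    for _ in range(n_outer):
        # Dual ascent step on the extrapolated primal point.
        y = y + sigma * (K @ x_bar - b)
        # Inexact primal prox step: a deterministic number of local GD steps.
        x_new = inexact_prox_gd(grad_f, L, x - tau * (K.T @ y), tau, x, n_inner)
        # Extrapolation with parameter 1.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x, y


# Toy instance (assumed data): minimize f(x) subject to x1 + x2 + x3 = 1.
Q = np.diag([1.0, 2.0, 4.0])
c = np.ones(3)
K = np.ones((1, 3))
b = np.array([1.0])
x, y = chambolle_pock_inexact(Q, c, K, b)
print(x, y)
```

For this toy instance the optimality conditions Qx + K^T y = c and Kx = b have the solution x = (4/7, 2/7, 1/7), y = 3/7, which the iterates should approach. Because the inner loop is warm-started at the current iterate, a fixed point of the inexact scheme is also a fixed point of the exact one, which is one reason a small fixed number of inner steps can be enough in a sketch like this.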




Accelerated Primal-Dual Methods for Convex-Strongly-Concave Saddle Point Problems

In this work, we aim to investigate Primal-Dual (PD) methods for convex-...

DualFL: A Duality-based Federated Learning Algorithm with Communication Acceleration in the General Convex Regime

We propose a novel training algorithm called DualFL (Dualized Federated ...

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!

We introduce ProxSkip – a surprisingly simple and provably efficient met...

Can 5th Generation Local Training Methods Support Client Sampling? Yes!

The celebrated FedAvg algorithm of McMahan et al. (2017) is based on thr...

Revisiting the Primal-Dual Method of Multipliers for Optimisation over Centralised Networks

The primal-dual method of multipliers (PDMM) was originally designed for...

A dual approach for federated learning

We study the federated optimization problem from a dual perspective and ...

Nesterov Meets Optimism: Rate-Optimal Optimistic-Gradient-Based Method for Stochastic Bilinearly-Coupled Minimax Optimization

We provide a novel first-order optimization algorithm for bilinearly-cou...
