Faster Rates for Compressed Federated Learning with Client-Variance Reduction

by Haoyu Zhao et al.

Due to the communication bottleneck in distributed and federated learning applications, algorithms using communication compression have attracted significant attention and are widely used in practice. Moreover, the huge number, high heterogeneity, and limited availability of clients result in high client variance. This paper addresses these two issues together by proposing the compressed and client-variance-reduced methods COFIG and FRECON. We prove an O((1+ω)^{3/2}√N/(Sϵ²) + (1+ω)N^{2/3}/(Sϵ²)) bound on the number of communication rounds of COFIG in the nonconvex setting, where N is the total number of clients, S is the number of clients participating in each round, ϵ is the convergence error, and ω is the variance parameter associated with the compression operator. For FRECON, we prove an O((1+ω)√N/(Sϵ²)) bound on the number of communication rounds. In the convex setting, COFIG converges within O((1+ω)√N/(Sϵ)) communication rounds, which is also the first convergence result for compression schemes that do not communicate with all clients in each round. We stress that neither COFIG nor FRECON needs to communicate with all the clients, and both enjoy the first or faster convergence results for convex and nonconvex federated learning in the regimes considered. Experimental results point to an empirical superiority of COFIG and FRECON over existing baselines.
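The abstract is not specific about which compressor is used; the bounds above only require an unbiased compression operator C with relative variance ω, i.e. E[C(x)] = x and E‖C(x) − x‖² ≤ ω‖x‖². As a hedged illustration (a standard example, not necessarily the operator used in the paper), random-k sparsification satisfies these conditions with ω = d/k − 1:

```python
import numpy as np

def rand_k(x, k, rng=None):
    """Unbiased random-k sparsification compressor.

    Keeps k of the d coordinates of x (chosen uniformly at random)
    and rescales them by d/k, so that E[C(x)] = x. Its variance
    satisfies E||C(x) - x||^2 <= omega * ||x||^2 with omega = d/k - 1.
    """
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    out = np.zeros_like(x, dtype=float)
    idx = rng.choice(d, size=k, replace=False)  # sample k coordinates without replacement
    out[idx] = x[idx] * (d / k)                 # rescale to keep the estimator unbiased
    return out
```

Smaller k means fewer bits sent per round but a larger ω, which is exactly the trade-off the (1+ω) factors in the bounds above capture.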



Related research:

- FedPAGE: A Fast Local Stochastic Gradient Method for Communication-Efficient Federated Learning
- BEER: Fast O(1/T) Rate for Decentralized Nonconvex Optimization with Communication Compression
- Optimal Client Sampling for Federated Learning
- Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor
- CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
- FetchSGD: Communication-Efficient Federated Learning with Sketching
- A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting
