Server-Side Stepsizes and Sampling Without Replacement Provably Help in Federated Optimization

by Grigory Malinovsky et al.

We present a theoretical study of server-side optimization in federated learning. Our results are the first to show that the popular heuristic of scaling the client updates with an extra parameter is very useful in the context of Federated Averaging (FedAvg) with local passes over the client data. Each local pass is performed without replacement using Random Reshuffling, which is a key reason we can show improved complexities. In particular, we prove that whenever the local stepsizes are small and the update direction is given by FedAvg in conjunction with Random Reshuffling over all clients, one can take a big leap in the obtained direction and improve the rates for convex, strongly convex, and non-convex objectives. In the non-convex regime, for example, the rate of convergence improves from 𝒪(ε^-3) to 𝒪(ε^-2); this result is new even for Random Reshuffling performed on a single node. In contrast, if the local stepsizes are large, we prove that the noise of client sampling can be controlled by using a small server-side stepsize. To the best of our knowledge, this is the first time that local steps provably help to overcome the communication bottleneck. Together, our results on the advantages of large and small server-side stepsizes give a formal justification for the practice of adaptive server-side optimization in federated learning. Moreover, we consider a variant of our algorithm that supports partial client participation, which makes the method more practical.
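The scheme described above can be sketched in a few lines: each client runs one local epoch of SGD with sampling without replacement (Random Reshuffling), and the server then scales the averaged client update by a separate server-side stepsize. This is a minimal illustrative sketch on a synthetic least-squares objective, not the paper's exact algorithm; the names (`run_fedavg_rr`, `eta_local`, `eta_server`) are our own.

```python
# Hedged sketch of FedAvg with Random Reshuffling plus a server-side
# stepsize, on a simple noiseless least-squares problem. All function
# and parameter names are illustrative assumptions, not from the paper.
import numpy as np

def local_pass_rr(w, X, y, eta_local, rng):
    """One local epoch over a client's data, sampled without replacement."""
    w = w.copy()
    for i in rng.permutation(len(y)):        # Random Reshuffling order
        grad = (X[i] @ w - y[i]) * X[i]      # per-sample least-squares gradient
        w -= eta_local * grad
    return w

def run_fedavg_rr(clients, w0, eta_local, eta_server, rounds, seed=0):
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(rounds):
        # Each client runs one RR epoch starting from the current model.
        locals_ = [local_pass_rr(w, X, y, eta_local, rng) for X, y in clients]
        avg = np.mean(locals_, axis=0)
        # The server scales the averaged update direction: with
        # eta_server > 1 it "takes a big leap" in the obtained direction.
        w = w + eta_server * (avg - w)
    return w

# Usage: two clients holding synthetic, noiseless least-squares data.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0])
clients = []
for _ in range(2):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ w_true))
w = run_fedavg_rr(clients, np.zeros(2), eta_local=0.01, eta_server=1.5, rounds=200)
```

With small local stepsizes, `avg - w` approximates a full-epoch gradient direction, so a server stepsize larger than one extrapolates along it, mirroring the small-local-stepsize regime discussed in the abstract.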




