FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging

02/13/2023
by Junyi Li, et al.

Federated learning (FL) is an emerging learning paradigm for tackling massively distributed data: a set of clients jointly performs a machine learning task under the coordination of a server. FedAvg is one of the most widely used algorithms for solving FL problems, but it uses a constant learning rate rather than one that adapts during training. Adaptive gradient methods show superior performance over constant learning-rate schedules; however, there is still no general framework for incorporating adaptive gradient methods into the federated setting. In this paper, we propose FedDA, a novel framework for local adaptive gradient methods. The framework adopts a restarted dual averaging technique and is flexible with respect to the gradient estimator and the adaptive learning rate formulation. In particular, we analyze FedDA-MVR, an instantiation of our framework, and show that it achieves gradient complexity Õ(ϵ^-1.5) and communication complexity Õ(ϵ^-1) for finding an ϵ-stationary point. This matches the best-known rates for first-order FL algorithms, and FedDA-MVR is the first adaptive FL algorithm to achieve them. We also perform extensive numerical experiments to verify the efficacy of our method.
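To make the framework's ingredients concrete, below is a minimal sketch of restarted local dual averaging combined with a momentum-based variance-reduced (MVR, STORM-style) gradient estimator and an AdaGrad-style coordinate-wise learning rate, as the abstract describes. All names (fedda_mvr_sketch, grad_fns, the toy quadratic objectives) and all hyperparameter values are illustrative assumptions, not the authors' implementation or notation.

```python
# Illustrative sketch only: names, hyperparameters, and the toy objective
# below are assumptions for exposition, not the paper's exact pseudocode.
import numpy as np

def mvr_estimate(grad_now, grad_prev, est_prev, beta):
    """STORM-style momentum-based variance-reduced gradient estimate,
    evaluated with the same stochastic sample at the two iterates."""
    return grad_now + (1.0 - beta) * (est_prev - grad_prev)

def fedda_mvr_sketch(grad_fns, dim, num_rounds=50, local_steps=10,
                     eta=0.1, beta=0.9, eps=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    x_server = np.zeros(dim)
    for _ in range(num_rounds):
        client_models = []
        for grad_fn in grad_fns:
            # Restart: every round begins from the server model with a
            # fresh dual (accumulated-gradient) variable.
            x = x_prev = x_server.copy()
            z = np.zeros(dim)        # dual accumulator
            v_sq = np.zeros(dim)     # second moment for adaptive scaling
            est = grad_fn(x, rng.standard_normal(dim))
            for _ in range(local_steps):
                z += est
                v_sq += est ** 2
                # Dual-averaging step: the iterate is recomputed from the
                # round's anchor point with a coordinate-wise adaptive
                # (AdaGrad-style) learning rate.
                x_prev, x = x, x_server - eta * z / (np.sqrt(v_sq) + eps)
                noise = rng.standard_normal(dim)  # shared sample
                est = mvr_estimate(grad_fn(x, noise), grad_fn(x_prev, noise),
                                   est, beta)
            client_models.append(x)
        # Server aggregation: average the client models, FedAvg-style.
        x_server = np.mean(client_models, axis=0)
    return x_server

# Toy usage: two clients, each with a noisy quadratic 0.5 * ||x - c_i||^2.
if __name__ == "__main__":
    centers = [np.array([1.0, -2.0]), np.array([3.0, 0.5])]
    grad_fns = [lambda x, n, c=c: (x - c) + 0.01 * n for c in centers]
    print("approx. minimizer:", fedda_mvr_sketch(grad_fns, dim=2))
    # The global optimum is the mean of the centers, (2.0, -0.75).
```

Two design points distinguish this sketch from plain FedAvg: each local step recomputes the iterate from the round's anchor point via the accumulated dual variable (dual averaging) rather than updating the iterate incrementally, and the dual variable is reset at every communication round (the restart). Both follow the abstract's description; the exact update rules in the paper may differ.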
