LASG: Lazily Aggregated Stochastic Gradients for Communication-Efficient Distributed Learning

02/26/2020
by Tianyi Chen, et al.

This paper targets solving distributed machine learning problems such as federated learning in a communication-efficient fashion. A class of new stochastic gradient descent (SGD) approaches has been developed, which can be viewed as the stochastic generalization of the recently developed lazily aggregated gradient (LAG) method, justifying the name LASG. LAG adaptively predicts the contribution of each round of communication and chooses to perform only the significant ones, saving communication while maintaining the rate of convergence. However, LAG only works with deterministic gradients, and applying it to stochastic gradients yields poor performance. The key components of LASG are a set of new rules tailored for stochastic gradients that can be implemented to save downloads, uploads, or both. The new algorithms adaptively choose between fresh and stale stochastic gradients and have convergence rates comparable to those of the original SGD. LASG achieves impressive empirical performance; it typically saves the total communication by an order of magnitude.
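To make the "fresh vs. stale gradient" idea concrete, below is a minimal Python sketch of a LAG-style lazy upload rule for a single worker. It is not the paper's exact LASG rule: the trigger here is an assumed fixed squared-norm threshold (`threshold`), whereas the paper's rules build the trigger from quantities tailored to stochastic gradients; the least-squares objective and all names (`LazyWorker`, `stochastic_grad`) are hypothetical illustrations.

import numpy as np

def stochastic_grad(w, batch):
    # Mini-batch gradient of a least-squares loss (stand-in objective).
    X, y = batch
    return X.T @ (X @ w - y) / len(y)

class LazyWorker:
    def __init__(self, dim, threshold=1e-3):
        self.last_sent = np.zeros(dim)   # stale gradient the server already holds
        self.threshold = threshold       # assumed fixed trigger constant (not the paper's rule)

    def step(self, w, batch):
        g = stochastic_grad(w, batch)
        # Upload only if the fresh stochastic gradient differs enough
        # from the last gradient the server received; otherwise the
        # server keeps reusing the stale one and an upload is skipped.
        if np.sum((g - self.last_sent) ** 2) >= self.threshold:
            self.last_sent = g
            return g, True
        return self.last_sent, False

# Toy usage: one worker, plain SGD updates on the server side.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(256, 10)), rng.normal(size=256)
w, worker, lr, uploads = np.zeros(10), LazyWorker(10), 0.1, 0
for t in range(100):
    idx = rng.choice(256, size=32, replace=False)
    g, sent = worker.step(w, (X[idx], y[idx]))
    uploads += sent
    w -= lr * g
print(f"uploads used: {uploads}/100")

In this sketch the communication saving comes entirely from rounds where the stale gradient is reused; the paper's contribution is designing rules of this kind that remain sound when the gradients are stochastic rather than deterministic.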


Related research

CADA: Communication-Adaptive Distributed Adam (12/31/2020)
Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients (09/17/2019)
LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning (05/25/2018)
Communication-Efficient Distributed SGD with Compressed Sensing (12/15/2021)
On Biased Compression for Distributed Learning (02/27/2020)
Active Labeling: Streaming Stochastic Gradients (05/26/2022)
