SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Stochastic Optimization

10/31/2019
by Navjot Singh, et al.

In this paper, we propose and analyze SPARQ-SGD, an event-triggered, compressed-communication algorithm for decentralized training of large-scale machine learning models. Each node locally evaluates a condition (event) that triggers a communication in which quantized and sparsified local model parameters are sent. In SPARQ-SGD, each node takes at least a fixed number (H) of local gradient steps and then checks whether its model parameters have changed significantly since its last communication; it communicates the further compressed model parameters only when the change is significant, as specified by a design criterion. We prove that SPARQ-SGD converges at rates O(1/(nT)) and O(1/√(nT)) in the strongly convex and non-convex settings, respectively, demonstrating that such aggressive compression, including event-triggered communication, model sparsification, and quantization, does not affect the overall convergence rate compared to uncompressed decentralized training, thereby theoretically yielding communication efficiency for "free". We evaluate SPARQ-SGD on real datasets and demonstrate significant savings in communication over the state of the art.
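The abstract describes the per-node procedure: at least H local SGD steps, an event-trigger test on how much the parameters have changed, and a compressed (quantized plus sparsified) exchange only when the trigger fires. Below is a minimal single-node sketch of that logic in Python; the specific compression operator, trigger threshold, mixing weight, and the helpers grad_fn and neighbors_avg are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def compress(v, k=10, levels=256):
    """Sketch of sparsification + quantization: keep the top-k entries, quantize them."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]                  # top-k sparsification
    scale = np.abs(v[idx]).max() + 1e-12
    out[idx] = np.round(v[idx] / scale * (levels // 2)) / (levels // 2) * scale
    return out

def sparq_sgd_node(x, grad_fn, neighbors_avg, T, H=5, lr=0.05, threshold=1e-3):
    """One node's loop: H local gradient steps, then an event-triggered compressed exchange.

    grad_fn(x)       -> stochastic gradient at x           (assumed, user-supplied)
    neighbors_avg(m) -> send m, return neighbors' average  (assumed, user-supplied)
    """
    x_hat = x.copy()                                  # last state made public to neighbors
    for t in range(T):
        for _ in range(H):                            # at least H local gradient steps
            x = x - lr * grad_fn(x)
        if np.linalg.norm(x - x_hat) ** 2 > threshold:    # event trigger: significant change
            msg = compress(x - x_hat)                 # quantized + sparsified model update
            x_hat = x_hat + msg                       # neighbors update their copy identically
            x = 0.5 * x + 0.5 * neighbors_avg(x_hat)  # gossip/consensus mixing step
    return x
```

If the trigger never fires, the node keeps taking local steps and sends nothing, which is where the communication savings come from in this sketch.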


Related research

05/13/2020  SQuARM-SGD: Communication-Efficient Momentum SGD for Decentralized Optimization
            In this paper, we consider the problem of communication-efficient decent...

06/07/2023  Get More for Less in Decentralized Learning Systems
            Decentralized learning (DL) systems have been gaining popularity because...

06/06/2019  Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
            Communication bottleneck has been identified as a significant issue in d...

11/20/2020  On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization
            In decentralized optimization, it is common algorithmic practice to have...

02/26/2020  Moniqua: Modulo Quantized Communication in Decentralized SGD
            Running Stochastic Gradient Descent (SGD) in a decentralized fashion has...

09/08/2019  Distributed Deep Learning with Event-Triggered Communication
            We develop a Distributed Event-Triggered Stochastic GRAdient Descent (DE...

06/05/2023  Improved Stability and Generalization Analysis of the Decentralized SGD Algorithm
            This paper presents a new generalization error analysis for the Decentra...
