DeepAI AI Chat
Log In Sign Up

Preserved central model for faster bidirectional compression in distributed settings

by   Constantin Philippenko, et al.

We develop a new approach to tackle communication constraints in a distributed learning problem with a central server. We propose and analyze a new algorithm that performs bidirectional compression and achieves the same convergence rate as algorithms using only uplink (from the local workers to the central server) compression. To obtain this improvement, we design MCM, an algorithm such that the downlink compression only impacts local models, while the global model is preserved. As a result, and contrary to previous works, the gradients on local servers are computed on perturbed models. Consequently, convergence proofs are more challenging and require a precise control of this perturbation. To ensure it, MCM additionally combines model compression with a memory mechanism. This analysis opens new doors, e.g. incorporating worker dependent randomized-models and partial participation.


page 1

page 2

page 3

page 4


Artemis: tight convergence guarantees for bidirectional compression in Federated Learning

We introduce a new algorithm - Artemis - tackling the problem of learnin...

Downlink Compression Improves TopK Sparsification

Training large neural networks is time consuming. To speed up the proces...

Lower Bounds and Nearly Optimal Algorithms in Distributed Learning with Communication Compression

Recent advances in distributed optimization and learning have shown that...

Communication-Efficient Adam-Type Algorithms for Distributed Data Mining

Distributed data mining is an emerging research topic to effectively and...

EF21 with Bells Whistles: Practical Algorithmic Extensions of Modern Error Feedback

First proposed by Seide (2014) as a heuristic, error feedback (EF) is a ...

Bidirectional Text Compression in External Memory

Bidirectional compression algorithms work by substituting repeated subst...