Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top

by   Eduard Gorbunov, et al.

Byzantine-robustness has been gaining a lot of attention due to the growth of the interest in collaborative and federated learning. However, many fruitful directions, such as the usage of variance reduction for achieving robustness and communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA - a new Byzantine-tolerant method with variance reduction and compression. A key message of our paper is that variance reduction is key to fighting Byzantine workers more effectively. At the same time, communication compression is a bonus that makes the process more communication efficient. We derive theoretical convergence guarantees for Byz-VR-MARINA outperforming previous state-of-the-art for general non-convex and Polyak-Lojasiewicz loss functions. Unlike the concurrent Byzantine-robust methods with variance reduction and/or compression, our complexity results are tight and do not rely on restrictive assumptions such as boundedness of the gradients or limited compression. Moreover, we provide the first analysis of a Byzantine-tolerant method supporting non-uniform sampling of stochastic gradients. Numerical experiments corroborate our theoretical findings.


page 1

page 2

page 3

page 4


BROADCAST: Reducing Both Stochastic and Compression Noise to Robustify Communication-Efficient Federated Learning

Communication between workers and the master node to collect local stoch...

Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient

Distributed optimization with open collaboration is a popular field sinc...

Artemis: tight convergence guarantees for bidirectional compression in Federated Learning

We introduce a new algorithm - Artemis - tackling the problem of learnin...

MARINA: Faster Non-Convex Distributed Learning with Compression

We develop and analyze MARINA: a new communication efficient method for ...

Distributed Training with Heterogeneous Data: Bridging Median and Mean Based Algorithms

Recently, there is a growing interest in the study of median-based algor...

A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting

We present a new method that includes three key components of distribute...

EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression

The starting point of this paper is the discovery of a novel and simple ...

Please sign up or login with your details

Forgot password? Click here to reset