Decentralized gradient methods: does topology matter?

02/28/2020
by Giovanni Neglia, et al.

Consensus-based distributed optimization methods have recently been advocated as alternatives to the parameter server and ring all-reduce paradigms for large-scale training of machine learning models. In this setting, each worker maintains a local estimate of the optimal parameter vector and iteratively updates it by averaging the estimates obtained from its neighbors and applying a correction on the basis of its local dataset. While theoretical results suggest that the worker communication topology should have a strong impact on the number of epochs needed to converge, previous experiments have suggested the opposite conclusion. This paper sheds light on this apparent contradiction and shows how sparse topologies can lead to faster convergence even in the absence of communication delays.
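To make the update rule in the abstract concrete, here is a minimal sketch of a consensus-based decentralized SGD step: each worker averages its neighbors' estimates according to a mixing matrix W encoding the topology, then takes a local gradient step. The least-squares objective, the function names, and the example data are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of consensus-based decentralized SGD (illustrative only).
# Assumes a fixed doubly stochastic mixing matrix W, where W[i][j] > 0 only
# if workers i and j are neighbors, and a local least-squares objective.
import numpy as np

def decentralized_sgd(W, X, y, lr=0.1, n_iters=100):
    """Each worker i holds a local shard (X[i], y[i]) and an estimate
    theta[i]. Per iteration: average neighbors' estimates (weighted by W),
    then apply a local gradient correction on the worker's own data."""
    n_workers, d = len(X), X[0].shape[1]
    theta = np.zeros((n_workers, d))  # one parameter estimate per worker
    for _ in range(n_iters):
        # Consensus step: mix estimates along the communication topology.
        theta = W @ theta
        # Local correction: one gradient step per worker on its own shard.
        for i in range(n_workers):
            grad = X[i].T @ (X[i] @ theta[i] - y[i]) / len(y[i])
            theta[i] -= lr * grad
    return theta

# Example: a sparse ring topology over 4 workers (2 neighbors each).
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])
rng = np.random.default_rng(0)
X = [rng.normal(size=(50, 3)) for _ in range(4)]
true_theta = np.array([1.0, -2.0, 0.5])
y = [x @ true_theta + 0.01 * rng.normal(size=50) for x in X]
print(decentralized_sgd(W, X, y))  # workers' estimates should roughly agree
```

A sparser W means fewer per-iteration exchanges but slower mixing of information across workers; the paper's question is how this trade-off plays out in the number of epochs to converge.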


Related research

02/11/2021 — Straggler-Resilient Distributed Machine Learning with Dynamic Backup Workers
With the increasing demand for large-scale training of machine learning ...

06/11/2023 — Straggler-Resilient Decentralized Learning via Adaptive Asynchronous Updates
With the increasing demand for large-scale training of machine learning ...

06/01/2023 — DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm
Decentralized Stochastic Gradient Descent (SGD) is an emerging neural ne...

10/21/2020 — Decentralized Deep Learning using Momentum-Accelerated Consensus
We consider the problem of decentralized deep learning where multiple ag...

05/19/2023 — Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence
Decentralized learning has recently been attracting increasing attention...

01/05/2023 — Beyond spectral gap (extended): The role of the topology in decentralized learning
In data-parallel optimization of machine learning models, workers collab...

06/07/2022 — Beyond spectral gap: The role of the topology in decentralized learning
In data-parallel optimization of machine learning models, workers collab...
