Network-Density-Controlled Decentralized Parallel Stochastic Gradient Descent in Wireless Systems

by Koya Sato, et al.

This paper proposes a communication strategy for decentralized learning over wireless systems. Our discussion is based on decentralized parallel stochastic gradient descent (D-PSGD), one of the state-of-the-art algorithms for decentralized learning. The main contribution of this paper is to raise a novel open question for decentralized learning over wireless systems: the density of the network topology may significantly influence the runtime performance of D-PSGD. In real wireless networks, path loss and multi-path fading make it difficult to guarantee delay-free communication without any deterioration, and these factors significantly degrade the runtime performance of D-PSGD. To alleviate this problem, we first analyze the runtime performance of D-PSGD under realistic wireless conditions. This analysis yields two key insights: a dense network topology (1) does not significantly improve the training accuracy of D-PSGD compared to a sparse one, and (2) strongly degrades the runtime performance because it generally requires low-rate transmission. Based on these findings, we propose a novel communication strategy in which each node estimates the optimal transmission rate that minimizes communication time during D-PSGD optimization under a network-density constraint characterized by the radio propagation properties. Numerical simulations reveal that the proposed strategy is capable of enhancing the runtime performance of D-PSGD.
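The rate-density tradeoff described above can be sketched in a few lines: raising the transmission rate shortens the per-round communication time but shrinks the decodable range under path loss, which sparsifies the topology. The following is a hypothetical illustration, not the authors' algorithm; the Shannon-capacity inversion, the log-distance path-loss model with exponent `ple`, and the minimum-neighbor constraint `k_min` are all our own assumptions.

```python
import math

def comm_range(rate_bps, bandwidth_hz, tx_power_dbm, noise_dbm, ple):
    """Maximum distance (m) at which `rate_bps` is decodable.

    Inverts Shannon capacity to get the required SNR, then inverts a
    log-distance path-loss model (exponent `ple`, reference distance 1 m).
    """
    snr_req = 2 ** (rate_bps / bandwidth_hz) - 1
    max_loss_db = tx_power_dbm - noise_dbm - 10 * math.log10(snr_req)
    return 10 ** (max_loss_db / (10 * ple))

def pick_rate(rates, neighbor_dists, k_min, model_bits, **channel):
    """Fastest rate whose range still covers at least `k_min` neighbors.

    Returns (rate, per-round communication time) or None if no rate keeps
    the topology dense enough.
    """
    best = None
    for r in rates:
        d = comm_range(r, **channel)
        if sum(dist <= d for dist in neighbor_dists) >= k_min:
            t = model_bits / r  # time to exchange one model copy
            if best is None or t < best[1]:
                best = (r, t)
    return best
```

For example, with neighbors at 100 m, 500 m, and 2000 m and a requirement of two neighbors, a node may safely pick a high rate that drops the farthest neighbor, cutting communication time while preserving the density constraint.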


Asynchronous decentralized accelerated stochastic gradient descent


Asynchronous Decentralized Learning over Unreliable Wireless Networks


Decentralized Learning over Wireless Networks: The Effect of Broadcast with Random Access


DeepSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression


AirGNN: Graph Neural Network over the Air


Decentralized SGD with Over-the-Air Computation


Decentralized learning for wireless communications and networking

