FastSGD: A Fast Compressed SGD Framework for Distributed Machine Learning

by   Keyu Yang, et al.

With the rapid increase of big data, distributed Machine Learning (ML) has been widely applied in training large-scale models. Stochastic Gradient Descent (SGD) is arguably the workhorse algorithm of ML. Distributed ML models trained by SGD involve large amounts of gradient communication, which limits the scalability of distributed ML. Thus, it is important to compress the gradients for reducing communication. In this paper, we propose FastSGD, a Fast compressed SGD framework for distributed ML. To achieve a high compression ratio at a low cost, FastSGD represents the gradients as key-value pairs, and compresses both the gradient keys and values in linear time complexity. For the gradient value compression, FastSGD first uses a reciprocal mapper to transform original values into reciprocal values, and then, it utilizes a logarithm quantization to further reduce reciprocal values to small integers. Finally, FastSGD filters reduced gradient integers by a given threshold. For the gradient key compression, FastSGD provides an adaptive fine-grained delta encoding method to store gradient keys with fewer bits. Extensive experiments on practical ML models and datasets demonstrate that FastSGD achieves the compression ratio up to 4 orders of magnitude, and accelerates the convergence time up to 8x, compared with state-of-the-art methods.


page 2

page 3

page 4

page 5

page 6

page 8

page 9

page 12


S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning

Distributed stochastic gradient descent (SGD) approach has been widely u...

Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization

Due to the explosion in the size of the training datasets, distributed l...

3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning

The performance and efficiency of distributed machine learning (ML) depe...

Compressed Key Sort and Fast Index Reconstruction

In this paper we propose an index key compression scheme based on the no...

Private and Communication-Efficient Edge Learning: A Sparse Differential Gaussian-Masking Distributed SGD Approach

With rise of machine learning (ML) and the proliferation of smart mobile...

Joint Coreset Construction and Quantization for Distributed Machine Learning

Coresets are small, weighted summaries of larger datasets, aiming at pro...

On Efficient Constructions of Checkpoints

Efficient construction of checkpoints/snapshots is a critical tool for t...

Please sign up or login with your details

Forgot password? Click here to reset