GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

11/16/2018
by Yanping Huang, et al.

GPipe is a scalable pipeline-parallelism library that enables training of giant deep neural networks. It partitions network layers across accelerators and pipelines execution to achieve high hardware utilization, and it leverages recomputation to minimize activation memory usage. For example, using partitions over 8 accelerators, it is able to train networks that are 25x larger, demonstrating its scalability. It also guarantees that the computed gradients remain consistent regardless of the number of partitions. It achieves an almost linear speedup without any changes to the model parameters: when using 4x more accelerators, training the same model is up to 3.5x faster. We train a 557-million-parameter AmoebaNet model on ImageNet and achieve a new state-of-the-art 84.3% top-1 validation accuracy. We also use this learned model as an initialization for training on 7 different popular image classification datasets and obtain results that exceed the best published ones on 5 of them, including pushing CIFAR-10 accuracy to 99% and CIFAR-100 accuracy to 91.3%.
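The gradient-consistency guarantee mentioned above follows from GPipe's synchronous training scheme: each mini-batch is split into micro-batches that are pipelined through the stages, and per-stage gradients are accumulated across micro-batches before a single update. The toy sketch below (pure Python with hypothetical scalar "stages", not GPipe's actual API) illustrates why the accumulated gradients are identical no matter how many micro-batches the batch is split into:

```python
# Illustrative sketch of GPipe-style micro-batch gradient accumulation.
# Each "stage" is a single scalar weight multiply standing in for a
# network partition on one accelerator; the loss is L = 0.5 * y**2.

def stage_forward(w, x):
    return w * x

def pipeline_train_step(weights, batch, num_micro):
    """Split `batch` into `num_micro` micro-batches, run each through
    all stages, and accumulate per-stage gradients. Because the update
    is synchronous, the accumulated gradients match the full-batch
    gradients for any choice of `num_micro`."""
    grads = [0.0] * len(weights)
    size = len(batch) // num_micro
    micro_batches = [batch[i * size:(i + 1) * size] for i in range(num_micro)]
    for mb in micro_batches:
        for x in mb:
            # Forward pass, storing activations. (Real GPipe instead
            # recomputes activations during backward to save memory.)
            acts = [x]
            for w in weights:
                acts.append(stage_forward(w, acts[-1]))
            # Backward pass for L = 0.5 * y**2, so dL/dy = y.
            g = acts[-1]
            for i in reversed(range(len(weights))):
                grads[i] += g * acts[i]   # accumulate dL/dw_i
                g = g * weights[i]        # propagate to previous stage
    return grads

weights = [0.5, 2.0]
batch = [1.0, 2.0, 3.0, 4.0]
# Same gradients whether the batch is one chunk or four micro-batches.
print(pipeline_train_step(weights, batch, 1) ==
      pipeline_train_step(weights, batch, 4))  # True
```

This captures only the accumulation arithmetic; the actual speedup in GPipe comes from overlapping the micro-batches' execution across accelerators, which does not change the computed gradients.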

