Understanding and Optimizing Packed Neural Network Training for Hyper-Parameter Tuning

02/07/2020
by Rui Liu, et al.

As neural networks are increasingly employed in machine learning practice, organizations will have to determine how to share limited training resources among a diverse set of model training tasks. This paper studies jointly training multiple neural network models on a single GPU. We present an empirical study of this operation, called pack, and end-to-end experiments that suggest significant improvements for hyperparameter search systems. Our research prototype is in TensorFlow, and we evaluate performance across different models (ResNet, MobileNet, DenseNet, and MLP) and training scenarios. The results suggest: (1) packing two models can bring up to a 40% improvement over unpacked setups for a single training step, and the improvement increases when packing more models; (2) the benefit of a pack primitive depends largely on several factors, including memory capacity, chip architecture, neural network structure, and batch size; (3) there exists a trade-off between packing and unpacking when training multiple neural network models on limited resources; (4) a pack-based Hyperband is up to 2.7x faster than the original Hyperband training method in our experimental setting, and this improvement grows as memory size, and consequently the density of packed models, increases.
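
To make the pack idea concrete, here is a minimal sketch of what jointly evaluating several models in one step can look like. It is not the authors' TensorFlow prototype: it assumes all packed models are small two-layer MLPs of identical shape, uses NumPy rather than a GPU framework, and covers only the forward pass. The helper names (init_mlp, forward_single, pack, forward_packed) are hypothetical and chosen for illustration.

```python
import numpy as np

def init_mlp(rng, n_in, n_hidden, n_out):
    """Random weights for one two-layer MLP (hypothetical helper)."""
    return {
        "w1": rng.normal(size=(n_in, n_hidden)),
        "w2": rng.normal(size=(n_hidden, n_out)),
    }

def forward_single(params, x):
    """Unpacked forward pass: one model applied to one batch."""
    h = np.maximum(x @ params["w1"], 0.0)  # ReLU activation
    return h @ params["w2"]

def pack(models):
    """Stack each weight matrix along a new leading 'model' axis.

    Assumption: every model has the same layer shapes, so stacking works.
    """
    return {name: np.stack([m[name] for m in models]) for name in models[0]}

def forward_packed(packed, xs):
    """Packed forward pass: all models evaluated in one batched operation.

    xs has shape (n_models, batch, n_in); model m only sees xs[m].
    """
    h = np.maximum(np.einsum("mbi,mih->mbh", xs, packed["w1"]), 0.0)
    return np.einsum("mbh,mho->mbo", h, packed["w2"])

rng = np.random.default_rng(0)
models = [init_mlp(rng, n_in=8, n_hidden=16, n_out=4) for _ in range(3)]
xs = rng.normal(size=(3, 5, 8))  # one batch of 5 examples per model

# Unpacked: run each model separately. Packed: run them all at once.
unpacked_out = np.stack([forward_single(m, x) for m, x in zip(models, xs)])
packed_out = forward_packed(pack(models), xs)
```

Both paths compute the same per-model outputs; the packed path simply issues fewer, larger operations. Whether that translates into a speedup on real hardware depends on the factors the abstract lists, such as memory capacity, chip architecture, network structure, and batch size.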

