Enabling On-Device Smartphone GPU based Training: Lessons Learned

02/21/2022
by   Anish Das, et al.

Deep Learning (DL) has shown impressive performance in many mobile applications. Most existing work has focused on reducing the computational and resource overheads of running Deep Neural Network (DNN) inference on resource-constrained mobile devices. However, the other aspect of DNN operations, i.e., training (forward and backward passes) on smartphone GPUs, has received little attention thus far. To this end, we conduct an initial analysis to examine the feasibility of on-device training on smartphones using mobile GPUs. We first employ the open-source mobile DL framework MNN and its OpenCL backend for running compute kernels on GPUs. Next, we observe that training on CPUs is much faster than on GPUs and identify two possible bottlenecks behind this observation: (i) computation and (ii) memory. To address the computation bottleneck, we optimize the OpenCL backend's kernels, showing 2x improvements (40-70 GFLOPs) over CPUs (15-30 GFLOPs) on Snapdragon 8 series processors. However, full DNN training is still much slower on GPUs than on CPUs, indicating that the memory bottleneck plays a significant role in the lower performance of GPU over CPU. The data movement takes almost 91% of the training time. Based on the findings and failures during our investigation, we present limitations and practical guidelines for future directions.
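The GFLOPs figures above come from dividing a kernel's arithmetic work by its measured runtime. As a minimal sketch (the layer shape, stride-1/'same'-padding assumption, and the 4 ms timing below are illustrative, not from the paper), the effective throughput of a convolution kernel can be estimated like this:

```python
def conv2d_flops(n, c_in, c_out, h, w, k):
    """FLOPs for a 2D convolution, counting each multiply-accumulate
    as 2 FLOPs. Assumes stride 1 and 'same' padding, so the output
    spatial size equals the input's (illustrative assumption)."""
    return 2 * n * c_out * h * w * c_in * k * k

def gflops(flops, seconds):
    """Effective throughput in GFLOPS for a measured kernel time."""
    return flops / seconds / 1e9

# Hypothetical layer: batch 1, 3x3 conv, 64 -> 64 channels, 56x56 map
flops = conv2d_flops(1, 64, 64, 56, 56, 3)

# If this kernel took a measured 4 ms, the effective throughput is
# ~57.8 GFLOPS, i.e. within the 40-70 GFLOPs band reported for the
# optimized GPU kernels.
print(round(gflops(flops, 4e-3), 1))
```

Note that this counts only kernel arithmetic; with data movement dominating wall-clock time (as the abstract reports), end-to-end training throughput would be far lower than the per-kernel figure.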



