Low-rank Gradient Approximation For Memory-Efficient On-device Training of Deep Neural Network

01/24/2020
by Mary Gooneratne, et al.
Google
Duke University

Training machine learning models on mobile devices has the potential to improve both the privacy and the accuracy of the models. However, one of the major obstacles to achieving this goal is the memory limitation of mobile devices. Reducing training memory enables models with high-dimensional weight matrices, such as automatic speech recognition (ASR) models, to be trained on-device. In this paper, we propose approximating the gradient matrices of deep neural networks using a low-rank parameterization as an avenue to save training memory. The low-rank gradient approximation enables more advanced, memory-intensive optimization techniques to be run on-device. Our experimental results show that we can reduce the training memory by about 33.0% for Adam optimization. It uses comparable memory to momentum optimization and achieves a 4.5% relative lower word error rate on an ASR personalization task.
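To make the memory argument concrete, here is a rough back-of-the-envelope sketch. It is not the authors' implementation: the function names, the matrix shape, the rank, and the assumption that each optimizer buffer shrinks to the size of the low-rank factors are all illustrative. It counts only the gradient and the optimizer buffers for a single weight matrix and ignores weights and activations, so it does not reproduce the figures reported in the abstract.

```python
def full_rank_state(m, n, slots):
    """Elements held for an m x n gradient plus `slots` optimizer buffers of the same shape."""
    return (1 + slots) * m * n


def low_rank_state(m, n, r, slots):
    """Same accounting when the gradient is kept as two factors of shapes m x r and r x n.

    Assumption for illustration: every optimizer buffer is stored per factor,
    so it shrinks along with the gradient.
    """
    return (1 + slots) * r * (m + n)


# Hypothetical layer shape and rank, chosen only for the arithmetic.
m, n, r = 1024, 1024, 16
adam_slots = 2  # Adam keeps first- and second-moment estimates per parameter

full = full_rank_state(m, n, adam_slots)
low = low_rank_state(m, n, r, adam_slots)

print(f"full-rank elements: {full}")
print(f"low-rank elements:  {low}")
print(f"ratio:              {low / full}")
```

The point of the sketch is only the scaling: a full gradient costs on the order of m·n elements per buffer, while a rank-r factorization costs r·(m + n), which is why an optimizer with several buffers per parameter, such as Adam, stands to benefit most.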

09/08/2020

Low-Rank Training of Deep Neural Networks for Emerging Memory Technology

The recent success of neural networks for solving difficult decision tal...

09/27/2022

Exploring Low Rank Training of Deep Neural Networks

Training deep neural networks in low rank, i.e. with factorised layers, ...

08/02/2017

ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections

Deep neural networks have become ubiquitous for applications related to ...

10/06/2015

Structured Transforms for Small-Footprint Deep Learning

We consider the task of building compact deep learning pipelines suitabl...

10/02/2018

Training compact deep learning models for video classification using circulant matrices

In real world scenarios, model accuracy is hardly the only factor to con...

09/20/2018

Data Shuffling in Wireless Distributed Computing via Low-Rank Optimization

Intelligent mobile platforms such as smart vehicles and drones have rece...

03/10/2022

projUNN: efficient method for training deep networks with unitary matrices

In learning with recurrent or very deep feed-forward networks, employing...
