BitNet: Bit-Regularized Deep Neural Networks

08/16/2017
by Aswin Raghavan, et al.

We present a novel regularization scheme for training deep neural networks. The parameters of neural networks are usually unconstrained, with a dynamic range dispersed over the real line. Our key idea is to control the expressive power of the network by dynamically quantizing the range and set of values that the parameters can take. We formulate this idea as a novel end-to-end approach that augments the traditional classification loss with a regularizer inspired by the Minimum Description Length principle. For each layer of the network, our approach optimizes a translation and scaling factor along with integer-valued parameters. We empirically compare BitNet to an equivalent unregularized model on the MNIST and CIFAR-10 datasets. We show that BitNet converges faster to a higher-quality solution. Additionally, the resulting model is significantly smaller in size due to the use of integer parameters instead of floats.
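To illustrate what a per-layer translation and scaling factor paired with integer-valued parameters looks like in practice, here is a minimal NumPy sketch. The function name, the range-based choice of offset and scale, and the fixed bit width are assumptions for illustration only; in BitNet these quantities are optimized end-to-end under the MDL-inspired regularizer rather than computed from the weight range.

import numpy as np

def quantize_layer(weights, num_bits):
    """Map one layer's real-valued weights onto a uniform integer grid.

    Illustrative assumption: the translation (offset) and scaling factor are
    derived from the observed weight range; BitNet instead learns them jointly
    with the integer parameters during training.
    """
    offset = weights.min()                                   # translation factor
    scale = (weights.max() - offset) / (2 ** num_bits - 1)   # scaling factor
    scale = max(scale, 1e-8)                                 # guard against a degenerate range
    ints = np.round((weights - offset) / scale).astype(np.int32)  # integer-valued parameters
    return offset + scale * ints, ints, offset, scale

# Example: quantize one layer's weights to 4 bits.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128)).astype(np.float32)
w_q, ints, offset, scale = quantize_layer(w, num_bits=4)
print("max reconstruction error:", np.abs(w - w_q).max())
print("storage: %d small ints (+ 2 floats) instead of %d floats" % (ints.size, w.size))

The storage saving noted in the abstract follows from this representation: each layer stores only small integers plus one offset and one scale, instead of a full-precision float per parameter.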


