Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training

04/28/2022
by Milos Nikolic, et al.

We introduce a software-hardware co-design approach that reduces memory traffic and footprint during training with BFloat16 or FP32, boosting energy efficiency and execution-time performance. Specifically, we present methods that dynamically adjust the size and format of the floating-point containers used to store activations and weights during training. The different value distributions of exponents and mantissas lead us to different approaches for each. Gecko exploits the favourable exponent distribution with a lossless delta-encoding approach, reducing the total exponent footprint by up to 58% compared to a 32-bit floating-point baseline. To contend with the noisy mantissa distributions, we present two lossy methods that eliminate as many of the least significant bits as possible without affecting accuracy. Quantum Mantissa is a machine-learning-first mantissa compression method that taps into training's gradient descent algorithm to also learn minimal mantissa bitlengths at a per-layer granularity, reducing the total mantissa footprint by up to 92%. Alternatively, BitChop observes changes in the loss function during training to adjust the mantissa bitlength network-wide, yielding an 81% reduction in footprint. Schrödinger's FP implements hardware encoders/decoders that, guided by Gecko/Quantum Mantissa or Gecko/BitChop, transparently encode/decode values when transferring them to/from off-chip memory, boosting energy efficiency and reducing execution time.
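The abstract describes two complementary mechanics: a lossless delta encoding of the narrowly distributed exponent fields (Gecko) and lossy truncation of low-order mantissa bits (Quantum Mantissa / BitChop). The sketch below is a minimal, hypothetical illustration of those two ideas on IEEE-754 FP32 tensors; it is not the paper's implementation, and the function names (split_fp32, delta_encode_exponents, truncate_mantissa) are invented for this example.

```python
# Minimal sketch (not the paper's hardware design) of two ideas from the abstract:
# 1) lossless delta encoding of FP32 exponent fields (Gecko-style)
# 2) lossy truncation of low-order mantissa bits (Quantum Mantissa / BitChop-style)
# Assumes IEEE-754 binary32 layout: 1 sign bit, 8 exponent bits, 23 mantissa bits.

import numpy as np

def split_fp32(x: np.ndarray):
    """Split FP32 values into sign, 8-bit exponent, and 23-bit mantissa fields."""
    bits = x.astype(np.float32).view(np.uint32)
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, exponent, mantissa

def delta_encode_exponents(exponent: np.ndarray):
    """Lossless: store a base exponent plus small signed deltas.

    Tensors seen during training tend to have narrowly distributed exponents,
    so the deltas usually fit in far fewer than 8 bits.
    """
    base = int(exponent[0])
    deltas = exponent.astype(np.int16) - base
    # Rough bound on bits per delta: magnitude bits plus a sign bit.
    width = max(1, int(np.ceil(np.log2(np.abs(deltas).max() + 1))) + 1)
    return base, deltas, width

def truncate_mantissa(mantissa: np.ndarray, keep_bits: int):
    """Lossy: keep only the `keep_bits` most significant mantissa bits."""
    drop = 23 - keep_bits
    return (mantissa >> drop) << drop

def reassemble_fp32(sign, exponent, mantissa):
    """Pack fields back into FP32. Exponents decode back exactly; only the
    mantissa truncation introduces error."""
    bits = (sign << 31) | (exponent.astype(np.uint32) << 23) | mantissa
    return bits.view(np.float32)

# Example: compress a tensor of small weights.
weights = np.random.randn(1024).astype(np.float32) * 0.01
s, e, m = split_fp32(weights)
base, deltas, width = delta_encode_exponents(e)
approx = reassemble_fp32(s, e, truncate_mantissa(m, keep_bits=4))

print(f"exponent bits/value: {width} (vs. 8), mantissa bits/value: 4 (vs. 23)")
print(f"max relative error: {np.max(np.abs((approx - weights) / weights)):.3e}")
```

In the paper, the encoding and decoding are performed by hardware units on the path to and from off-chip memory, and the mantissa bitlength is chosen adaptively (learned per layer by Quantum Mantissa, or adjusted network-wide by BitChop from loss trends); the NumPy version above only illustrates the arithmetic of the container formats with a fixed bitlength.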

Related research

04/15/2021 · All-You-Can-Fit 8-Bit Flexible Floating-Point Format for Accurate and Memory-Efficient Inference of Deep Neural Networks
Modern deep neural network (DNN) models generally require a huge amount ...

01/30/2023 · Self-Compressing Neural Networks
This work focuses on reducing neural network size, which is a major driv...

02/04/2021 · EFloat: Entropy-coded Floating Point Format for Deep Learning
We describe the EFloat floating-point number format with 4 to 6 addition...

11/28/2018 · Predicting the Computational Cost of Deep Learning Models
Deep learning is rapidly becoming a go-to tool for many artificial intel...

04/04/2019 · Regularizing Activation Distribution for Training Binarized Deep Networks
Binarized Neural Networks (BNNs) can significantly reduce the inference ...

03/23/2022 · Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
Increasingly larger and better Transformer models keep advancing state-o...

08/16/2023 · Towards Zero Memory Footprint Spiking Neural Network Training
Biologically-inspired Spiking Neural Networks (SNNs), processing informa...
