Bit-Tactical: Exploiting Ineffectual Computations in Convolutional Neural Networks: Which, Why, and How

03/09/2018
by Alberto Delmas, et al.

We show that, during inference with Convolutional Neural Networks (CNNs), more than 2x to 8x ineffectual work can be exposed if, instead of targeting only those weights and activations that are zero, we target different combinations of value-stream properties. We demonstrate a practical application with Bit-Tactical (TCL), a hardware accelerator that exploits weight sparsity, per-layer precision variability, dynamic fine-grained precision reduction for activations, and, optionally, the naturally occurring sparse effectual bit content of activations to improve performance and energy efficiency. TCL benefits both sparse and dense CNNs, natively supports both convolutional and fully-connected layers, and exploits properties of all activations to reduce storage, communication, and computation demands. While TCL does not require changes to the CNN to deliver benefits, it does reward any technique that amplifies any of the aforementioned weight and activation value properties. Compared to an equivalent data-parallel accelerator for dense CNNs, TCLp, a variant of TCL, improves performance by 5.05x and is 2.98x more energy efficient while requiring 22
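The notion of ineffectual work can be made concrete with a small back-of-the-envelope estimate. The Python sketch below is illustrative only and is not the paper's TCL design: it measures two of the value-stream properties named above, the fraction of multiplies rendered ineffectual by zero weights, and the fraction of activation bit positions that carry no effectual ('1') bits under an assumed 8-bit fixed-point representation. The function name, array shapes, and quantization scheme are assumptions for illustration.

```python
# Minimal sketch (not the paper's implementation): estimating ineffectual work
# in a layer from two value-stream properties: zero weights and the effectual
# (non-zero) bit content of activations. 8-bit fixed point is assumed.
import numpy as np

def ineffectual_work_estimate(weights, activations, bits=8):
    """Return (fraction of multiplies skippable via zero weights,
               fraction of activation bit-serial terms that are zero)."""
    w = weights.ravel()
    a = activations.ravel()

    # Weight sparsity: every multiply against a zero weight is ineffectual.
    zero_weight_fraction = np.mean(w == 0)

    # Effectual bit content: in a bit-serial view, only the '1' bits of a
    # quantized activation contribute; the remaining bit positions are
    # ineffectual work for a dense bit-parallel multiplier.
    q = np.clip(np.round(np.abs(a) * (2 ** (bits - 1))),
                0, 2 ** bits - 1).astype(np.uint64)
    ones = np.array([bin(int(v)).count("1") for v in q])
    ineffectual_bit_fraction = 1.0 - ones.mean() / bits

    return zero_weight_fraction, ineffectual_bit_fraction

# Example: weights pruned to ~60% sparsity, activations after ReLU.
rng = np.random.default_rng(0)
w = rng.normal(size=1000) * (rng.random(1000) > 0.6)
a = np.maximum(rng.normal(size=1000), 0)
print(ineffectual_work_estimate(w, a))
```

Under these assumptions the two fractions compound: a term is effectual only if its weight is non-zero and, in a bit-serial datapath, only for the activation's '1' bits, which is why targeting combinations of value-stream properties exposes more ineffectual work than targeting zeros alone.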

