Training DNN IoT Applications for Deployment On Analog NVM Crossbars

10/30/2019
by Fernando García-Redondo, et al.

Deep Neural Network (DNN) applications are increasingly being deployed in always-on IoT devices. However, the scarce resources of tiny microcontroller units (MCUs) constrain the deployment of the required Machine Learning (ML) models. Therefore, alternatives to traditional architectures, such as Computation-In-Memory based on resistive nonvolatile memories (NVM), which promise high integration density, low power consumption, and massively parallel computation, are under study. These technologies are still immature, however, and suffer from problems intrinsic to their analog nature: noise, non-linearities, the inability to represent negative values, and limited precision per device. Consequently, mapping DNNs to NVM crossbars requires the full-custom design of each DNN layer, involving finely tuned blocks such as ADCs/DACs or current subtractors/adders, and thus limiting chip reconfigurability. This paper presents an NVM-aware framework to efficiently train and map DNNs to NVM hardware. We propose the first method that trains the NN weights while ensuring uniformity across layer weights/activations, improving the re-usability of HW blocks. Firstly, our quantization algorithm obtains uniform scaling across the DNN layers independently of their characteristics, removing the need for per-layer full-custom design and reducing the peripheral HW. Secondly, for certain applications we use Network Architecture Search to avoid negative weights. Unipolar weight matrices translate into simpler analog periphery, leading to a 67% area improvement and up to a 40% power reduction. We validate our idea on CIFAR10 and HAR applications by mapping to crossbars using 4-bit and 2-bit devices. Up to 92.91% accuracy (vs. 95% in floating point) is achieved for HAR using 2-bit, only-positive weights.
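The core idea — one quantization grid shared by every layer, restricted to non-negative (unipolar) levels so the crossbar needs no current subtractors — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm; the function name and the shared-scale scheme are assumptions for the example.

```python
import numpy as np

def quantize_unipolar(weights, n_bits=2, scale=None):
    """Uniformly quantize a weight tensor to non-negative n-bit levels.

    A single `scale` shared across all layers keeps every layer on the
    same grid, so one ADC/DAC design can serve every crossbar.
    (Hypothetical helper; the paper's training-time method is richer.)
    """
    levels = 2 ** n_bits - 1  # e.g. 3 non-zero levels for 2-bit devices
    if scale is None:
        scale = np.max(np.abs(weights)) / levels
    # Clip to [0, levels]: unipolar mapping, negatives collapse to zero
    q = np.clip(np.round(weights / scale), 0, levels)
    return q * scale, scale

# Two layers quantized against one shared scale derived from both
w1 = np.array([0.10, 0.55, 0.90, -0.20])  # the negative entry clips to 0
w2 = np.array([0.30, 0.75])
_, shared = quantize_unipolar(np.concatenate([w1, w2]), n_bits=2)
q1, _ = quantize_unipolar(w1, n_bits=2, scale=shared)
q2, _ = quantize_unipolar(w2, n_bits=2, scale=shared)
```

Because both layers use `shared`, the same peripheral circuitry (reference currents, ADC ranges) applies to both crossbars, which is the re-usability the abstract refers to.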

