DESCNet: Developing Efficient Scratchpad Memories for Capsule Network Hardware

by   Alberto Marchisio, et al.

Deep Neural Networks (DNNs) have been established as the state-of-the-art algorithm for advanced machine learning applications. Recently proposed by the Google Brain's team, the Capsule Networks (CapsNets) have improved the generalization ability, as compared to DNNs, due to their multi-dimensional capsules and preserving the spatial relationship between different objects. However, they pose significantly high computational and memory requirements, making their energy-efficient inference a challenging task. This paper provides, for the first time, an in-depth analysis to highlight the design and management related challenges for the (on-chip) memories deployed in hardware accelerators executing fast CapsNets inference. To enable an efficient design, we propose an application-specific memory hierarchy, which minimizes the off-chip memory accesses, while efficiently feeding the data to the hardware accelerator. We analyze the corresponding on-chip memory requirements and leverage it to propose a novel methodology to explore different scratchpad memory designs and their energy/area trade-offs. Afterwards, an application-specific power-gating technique is proposed to further reduce the energy consumption, depending upon the utilization across different operations of the CapsNets. Our results for a selected Pareto-optimal solution demonstrate no performance loss and an energy reduction of 79 complete accelerator, including computational units and memories, when compared to a state-of-the-art design executing Google's CapsNet model for the MNIST dataset.


page 1

page 5

page 6

page 11

page 13

page 14

page 15


CapStore: Energy-Efficient Design and Management of the On-Chip Memory for CapsuleNet Inference Accelerators

Deep Neural Networks (DNNs) have been established as the state-of-the-ar...

CapsAcc: An Efficient Hardware Accelerator for CapsuleNets with Data Reuse

Deep Neural Networks (DNNs) have been widely deployed for many Machine L...

Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference

The need to execute Deep Neural Networks (DNNs) at low latency and low p...

Breaking Barriers: Maximizing Array Utilization for Compute In-Memory Fabrics

Compute in-memory (CIM) is a promising technique that minimizes data tra...

SparseNN: An Energy-Efficient Neural Network Accelerator Exploiting Input and Output Sparsity

Contemporary Deep Neural Network (DNN) contains millions of synaptic con...

Memory-Oriented Design-Space Exploration of Edge-AI Hardware for XR Applications

Low-Power Edge-AI capabilities are essential for on-device extended real...

Please sign up or login with your details

Forgot password? Click here to reset