Improving the Performance of a NoC-based CNN Accelerator with Gather Support

08/01/2021
by Binayak Tiwari, et al.

The growing use of deep learning drives the need for efficient parallel computing architectures for Convolutional Neural Networks (CNNs). A significant challenge in designing a many-core CNN accelerator is handling data movement between the processing elements. The CNN workload introduces many-to-one traffic in addition to one-to-one and one-to-many traffic. As the de facto standard for on-chip communication, the Network-on-Chip (NoC) supports various unicast and multicast traffic patterns. Many-to-one traffic, however, is handled with repetitive unicast, which is inefficient. In this paper, we propose using a gather packet on mesh-based NoCs that employ an output-stationary systolic array to support many-to-one traffic. The gather packet collects data from the intermediate nodes along its route to the destination. The method is evaluated using traffic traces generated from the convolution layer of AlexNet and VGG-16, and it improves latency and power compared with the repetitive unicast method.
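
To make the contrast between repetitive unicast and a gather packet concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single mesh row of processing elements (PEs) whose results all travel east to one destination just past the last column. It counts only packet link traversals and ignores payload growth, contention, and router pipeline details. The function names `unicast_hops`, `gather_hops`, and `gather_row` are hypothetical.

```python
def unicast_hops(num_pes: int) -> int:
    """Total link traversals when every PE sends its own packet east.

    Assumption: the PE in column c is (num_pes - c) hops from the destination.
    """
    return sum(num_pes - col for col in range(num_pes))


def gather_hops(num_pes: int) -> int:
    """Link traversals for one gather packet launched by the farthest PE.

    Assumption: a single packet crosses the whole row once.
    """
    return num_pes


def gather_row(partial_sums):
    """Model a gather packet sweeping west to east.

    Each PE it passes appends its (column, value) pair to the payload.
    """
    packet = []
    for col, value in enumerate(partial_sums):
        packet.append((col, value))
    return packet


if __name__ == "__main__":
    n = 8
    print("repetitive unicast hops:", unicast_hops(n))
    print("gather packet hops:", gather_hops(n))
    print("gathered payload:", gather_row([3, 1, 4, 1]))
```

Under these simplifying assumptions, the unicast count grows quadratically with the row length while the gather count grows linearly. This is only an intuition for why collecting data at intermediate nodes can reduce traffic; the paper's reported latency and power results come from its own trace-driven evaluation.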

research
08/01/2021

Data Streaming and Traffic Gathering in Mesh-based NoC for Deep Neural Network Acceleration

The increasing popularity of deep neural network (DNN) applications dema...
research
09/30/2018

Mini-batch Serialization: CNN Training with Inter-layer Data Reuse

Training convolutional neural networks (CNNs) requires intense computati...
research
04/06/2019

Ring-Mesh: A Scalable and High-Performance Approach for Manycore Accelerators

There is an increasing number of works addressing the design challenge o...
research
07/31/2023

PATRONoC: Parallel AXI Transport Reducing Overhead for Networks-on-Chip targeting Multi-Accelerator DNN Platforms at the Edge

Emerging deep neural network (DNN) applications require high-performance...
research
08/22/2023

Octopus: A Heterogeneous In-network Computing Accelerator Enabling Deep Learning for network

Deep learning (DL) for network models have achieved excellent performanc...
research
06/27/2021

OCCAM: Optimal Data Reuse for Convolutional Neural Networks

Convolutional neural networks (CNNs) are emerging as powerful tools for ...
research
07/06/2021

Energy-Efficient Accelerator Design for Deformable Convolution Networks

Deformable convolution networks (DCNs) proposed to address the image rec...
