ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network

05/14/2020
by David Gschwend, et al.

Image Understanding is becoming a vital feature in ever more applications, ranging from medical diagnostics to autonomous vehicles. Many applications demand embedded solutions that integrate into existing systems with tight real-time and power constraints. Convolutional Neural Networks (CNNs) presently achieve record-breaking accuracies in all image understanding benchmarks, but have a very high computational complexity. Embedded CNNs thus call for small and efficient, yet very powerful computing platforms. This master's thesis explores the potential of FPGA-based CNN acceleration and demonstrates a fully functional proof-of-concept CNN implementation on a Zynq System-on-Chip.

The ZynqNet Embedded CNN is designed for image classification on ImageNet and consists of two parts: ZynqNet CNN, an optimized and customized CNN topology, and the ZynqNet FPGA Accelerator, an FPGA-based architecture for its evaluation.

ZynqNet CNN is a highly efficient CNN topology. Detailed analysis and optimization of prior topologies using the custom-designed Netscope CNN Analyzer have enabled a CNN with 84.5% top-5 accuracy at a computational complexity of only 530 million multiply-accumulate operations. The topology is highly regular and consists exclusively of convolutional layers, ReLU nonlinearities and one global pooling layer. The CNN fits ideally onto the FPGA accelerator.

The ZynqNet FPGA Accelerator allows an efficient evaluation of ZynqNet CNN. It accelerates the full network based on a nested-loop algorithm which minimizes the number of arithmetic operations and memory accesses. The FPGA accelerator has been synthesized using High-Level Synthesis for the Xilinx Zynq XC-7Z045, and reaches a clock frequency of 200 MHz with a device utilization of 80


