FPGA deep learning acceleration based on convolutional neural network
In view of the large amount of calculation and long calculation time of convolutional neural network (CNN), this paper proposes a convolutional neural network hardware accelerator based on field programmable logic gate array (FPGA). First, through in-depth analysis of the forward operation principle of the convolutional layer and exploration of the parallelism of the convolutional layer operation, a hardware architecture of input channel parallelism, output channel parallelism and convolution window deep pipeline is designed. Then in the above architecture, a fully parallel multiplication-addition tree module is designed to accelerate the convolution operation and an efficient window buffer module to implement the pipeline operation of the convolution window. The final experimental results show that the energy efficiency ratio of the accelerator proposed in this article reaches 32.73 GOPS/W, which is 34 existing solution, and the performance reaches 317.86 GOPS.
READ FULL TEXT