Dynasparse: Accelerating GNN Inference through Dynamic Sparsity Exploitation

03/22/2023
by Bingyi Zhang, et al.

Graph Neural Network (GNN) inference is used in many real-world applications. Data sparsity in GNN inference, including sparsity in both the input graph and the GNN model, offers opportunities to further speed up inference. Moreover, many pruning techniques proposed for model compression increase the data sparsity of GNNs. We propose Dynasparse, a comprehensive hardware-software codesign on FPGA that accelerates GNN inference through dynamic sparsity exploitation. To this end, we decouple the GNN computation kernels from the basic computation primitives and explore hardware-software codesign as follows: 1) Hardware design: We propose a novel unified accelerator design on FPGA that efficiently executes the various computation primitives. We develop a customized soft processor, tightly coupled with the accelerator, to execute a runtime system. We also develop efficient hardware mechanisms that profile data sparsity and perform on-the-fly data-format transformation to prepare the input data for the various computation primitives. 2) Software design: We develop a runtime system that works synergistically with the accelerator to perform dynamic kernel-to-primitive mapping based on data sparsity. We implement Dynasparse on a state-of-the-art FPGA platform, Xilinx Alveo U250, and evaluate the design using widely used GNN models (GCN, GraphSAGE, GIN, and SGC). Across these models and various input graphs, the proposed accelerator and dynamic kernel-to-primitive mapping reduce inference latency by 3.73× on average compared with the static mapping strategies employed in state-of-the-art GNN accelerators. Compared with state-of-the-art CPU (GPU) implementations, Dynasparse achieves up to 56.9× (2.37×) speedup in end-to-end latency.
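The abstract does not spell out the mapping policy itself, but the core idea of profiling operand sparsity at runtime and dispatching each kernel to a matching primitive can be sketched in software. The following is a minimal Python/SciPy analogue only, not the paper's implementation: the names (matmul_by_sparsity, density, to_dense), the 0.1 density cutoff, and the choice of GEMM/SpMM/SpGEMM as the three primitives are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp

DENSE_THRESHOLD = 0.1  # assumed cutoff; a real runtime would tune this per kernel

def density(m):
    """Profiling step: fraction of nonzero entries in a matrix."""
    nnz = m.nnz if sp.issparse(m) else np.count_nonzero(m)
    return nnz / (m.shape[0] * m.shape[1])

def to_dense(m):
    """On-the-fly format transformation to a dense ndarray."""
    return m.toarray() if sp.issparse(m) else np.asarray(m)

def matmul_by_sparsity(a, b):
    """Map the kernel a @ b to a primitive chosen at runtime from the
    profiled sparsity of both operands."""
    da, db = density(a), density(b)
    if da < DENSE_THRESHOLD and db < DENSE_THRESHOLD:
        # Both operands sparse -> SpGEMM on CSR operands (sparse result).
        return sp.csr_matrix(a) @ sp.csr_matrix(b)
    if da < DENSE_THRESHOLD:
        # Sparse x dense -> SpMM; only the sparse operand is converted to CSR.
        return sp.csr_matrix(a) @ to_dense(b)
    if db < DENSE_THRESHOLD:
        # Dense x sparse: compute (B^T @ A^T)^T so the sparse operand leads.
        return (sp.csr_matrix(b).T @ to_dense(a).T).T
    # Both operands dense -> plain GEMM.
    return to_dense(a) @ to_dense(b)

# Example: GCN-style aggregation, sparse adjacency x dense features -> SpMM branch.
A = sp.random(1000, 1000, density=0.01, format="csr")
X = np.random.rand(1000, 64)
H = matmul_by_sparsity(A, X)
```

In Dynasparse itself, per the abstract, the profiling and format transformation are performed by dedicated hardware mechanisms on the FPGA, and the mapping decision is made by the runtime system running on the soft processor; the sketch only mirrors that decision logic.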


