ReGraph: Scaling Graph Processing on HBM-enabled FPGAs with Heterogeneous Pipelines

by Xinyu Chen, et al.

The use of FPGAs for efficient graph processing has attracted significant interest. Recent memory subsystem upgrades, including the introduction of HBM in FPGAs, promise to further alleviate memory bottlenecks. However, modern multi-channel HBM requires many more processing pipelines to fully utilize its bandwidth potential. Existing designs do not scale well, resulting in underutilization of the HBM facilities even when all other resources are fully consumed. In this paper, we re-examine graph processing workloads and find considerable diversity in how they are processed. We also find that these diverse workloads can be easily classified into two types, namely dense and sparse partitions. This motivates us to propose a resource-efficient heterogeneous pipeline architecture. Our heterogeneous architecture comprises two types of pipelines: Little pipelines, which process dense partitions with good locality, and Big pipelines, which process sparse partitions with extremely poor locality. Unlike traditional monolithic pipeline designs, the heterogeneous pipelines are tailored to more specific memory access patterns and are hence more lightweight, allowing the architecture to scale up more effectively with limited resources. In addition, we propose a model-guided task scheduling method that schedules partitions to the right pipeline types, generates the most efficient pipeline combination, and balances workloads. Furthermore, we develop an automated open-source framework, called ReGraph, which automates the entire development process. ReGraph outperforms state-of-the-art FPGA accelerators by up to 5.9 times in terms of performance and 12 times in terms of resource efficiency.
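To make the idea of routing partitions to pipeline types more concrete, the following is a minimal Python sketch and not ReGraph's actual scheduler. It rests on several assumptions that do not come from the abstract: a partition counts as dense when its edges-per-destination-vertex ratio reaches a hypothetical `threshold`, edge count stands in for processing cost (the paper uses a performance model instead), and load is balanced with a simple greedy longest-first heuristic. The names `Partition`, `is_dense`, and `schedule` are illustrative only.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Partition:
    pid: int
    num_edges: int
    num_dst_vertices: int  # size of the destination-vertex range


def is_dense(p, threshold=4.0):
    # Assumption: density = edges per destination vertex; the threshold is hypothetical.
    return p.num_edges / p.num_dst_vertices >= threshold


def schedule(partitions, n_little, n_big, threshold=4.0):
    """Route dense partitions to Little pipelines and sparse ones to Big
    pipelines, then balance each group greedily by edge count."""
    if n_little < 1 or n_big < 1:
        raise ValueError("need at least one pipeline of each type")

    groups = {"little": [], "big": []}
    for p in partitions:
        groups["little" if is_dense(p, threshold) else "big"].append(p)

    plan = {}
    for kind, count in (("little", n_little), ("big", n_big)):
        # Min-heap of (accumulated load, pipeline index).
        heap = [(0, i) for i in range(count)]
        heapq.heapify(heap)
        assignment = {i: [] for i in range(count)}
        # Largest partitions first, each to the currently least-loaded pipeline.
        for p in sorted(groups[kind], key=lambda q: q.num_edges, reverse=True):
            load, i = heapq.heappop(heap)
            assignment[i].append(p.pid)
            heapq.heappush(heap, (load + p.num_edges, i))
        plan[kind] = assignment
    return plan
```

Calling `schedule(parts, n_little=2, n_big=1)` returns a dictionary that maps each pipeline type to the partition IDs assigned to each pipeline instance. Choosing how many pipelines of each type to build, which the abstract describes as generating the most efficient pipeline combination, is outside the scope of this sketch.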


Disaggregating Non-Volatile Memory for Throughput-Oriented Genomics Workloads

Massive exploitation of next-generation sequencing technologies requires...

Dataset Lifecycle Framework and its applications in Bioinformatics

Bioinformatics pipelines depend on shared POSIX filesystems for its inpu...

Kugelblitz: Streamlining Reconfigurable Packet Processing Pipeline Design and Evaluation

Reconfigurable packet processing pipelines have emerged as a common buil...

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

GPipe is a scalable pipeline parallelism library that enables learning o...

HEGrid: A High Efficient Multi-Channel Radio Astronomical Data Gridding Framework in Heterogeneous Computing Environments

The challenge to fully exploit the potential of existing and upcoming sc...

Progressive Structure from Motion

Structure from Motion or the sparse 3D reconstruction out of individual ...

An Efficient Dispatcher for Large Scale GraphProcessing on OpenCL-based FPGAs

High parallel framework has been proved to be very suitable for graph pr...
