High-Performance Massive Subgraph Counting using Pipelined Adaptive-Group Communication

by Langshi Chen, et al.
Indiana University Bloomington
Indiana University
Virginia Polytechnic Institute and State University

Subgraph counting aims to count the number of occurrences of a subgraph T (also known as a template) in a given graph G. The basic problem has found applications in diverse domains. The problem is known to be computationally challenging: the complexity grows as a function of both T and G. Recent applications have motivated solving such problems on massive networks with billions of vertices. In this chapter, we study the subgraph counting problem from a parallel computing perspective. We discuss efficient parallel algorithms for approximately solving subgraph counting problems using the color-coding technique. We then present several system-level strategies that substantially improve the overall performance of the algorithm on massive subgraph counting problems. We propose: 1) a novel pipelined Adaptive-Group communication pattern to improve inter-node scalability, 2) a fine-grained pipeline design to effectively reduce the memory footprint of intermediate results, and 3) partitioning the neighbor lists of subgraph vertices to achieve better thread concurrency and workload balance. Experiments on an Intel Xeon E5 cluster show that our implementation achieves a 5x speedup over the state-of-the-art work while reducing peak memory utilization by a factor of 2 on large templates of 12 to 15 vertices and input graphs of 2 to 5 billion edges.
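
To illustrate the color-coding idea mentioned in the abstract, the following is a minimal single-threaded sketch. It assumes the template T is a simple path on k vertices and that G is an undirected graph given as adjacency lists. The function names `count_colorful_paths` and `estimate_paths` are hypothetical and are not taken from the authors' implementation, and the sketch does not model the communication or pipelining strategies. Each iteration colors the vertices uniformly at random with k colors, counts the "colorful" paths (all vertex colors distinct) by dynamic programming over color sets, and rescales by k^k / k!, the inverse of the probability that a fixed path is colorful.

```python
import random
from math import factorial

def count_colorful_paths(adj, colors, k):
    """Count undirected simple paths on k vertices whose colors are all distinct."""
    # table[v] maps a color bitmask to the number of colorful paths ending at v
    # that use exactly the colors in that mask.
    table = [{1 << colors[v]: 1} for v in range(len(adj))]
    for _ in range(k - 1):
        nxt = [dict() for _ in adj]
        for v, nbrs in enumerate(adj):
            bit = 1 << colors[v]
            for u in nbrs:
                for mask, cnt in table[u].items():
                    # Extend a path ending at u by v only if v's color is unused.
                    if not mask & bit:
                        nxt[v][mask | bit] = nxt[v].get(mask | bit, 0) + cnt
        table = nxt
    directed = sum(sum(t.values()) for t in table)
    # For k > 1 every undirected path is counted once from each endpoint.
    return directed // 2 if k > 1 else directed

def estimate_paths(adj, k, iterations, seed=0):
    """Average the rescaled colorful counts over several random colorings."""
    rng = random.Random(seed)
    scale = k ** k / factorial(k)  # 1 / P(a fixed k-vertex path is colorful)
    total = 0.0
    for _ in range(iterations):
        colors = [rng.randrange(k) for _ in adj]
        total += count_colorful_paths(adj, colors, k) * scale
    return total / iterations
```

For example, on the triangle `[[1, 2], [0, 2], [0, 1]]` with k = 2, `estimate_paths` should converge to 3, the number of edges, as the number of iterations grows. Because the colorings are independent, the iterations can be run in parallel and averaged, which is one reason the technique suits a parallel setting.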


