PMV: Pre-partitioned Generalized Matrix-Vector Multiplication for Scalable Graph Mining

09/26/2017
by   Chiwan Park, et al.

How can we analyze enormous networks, such as the Web and social networks, that have hundreds of billions of nodes and edges? Network analyses have been conducted with various graph mining methods, including shortest path computation, PageRank, connected component computation, and random walk with restart. These graph mining methods can be expressed as generalized matrix-vector multiplication, which consists of a few operations inspired by typical matrix-vector multiplication. Recently, several graph processing systems based on matrix-vector multiplication or on their own primitives have been proposed to deal with large graphs; however, they all fail on Web-scale graphs because of insufficient memory space or a lack of consideration for I/O costs. In this paper, we propose PMV (Pre-partitioned generalized Matrix-Vector multiplication), a scalable graph mining method based on generalized matrix-vector multiplication on distributed systems. PMV significantly decreases the communication cost, which is the main bottleneck of distributed systems, by partitioning the input graph in advance and judiciously applying execution strategies based on the density of the pre-partitioned sub-matrices. Experiments show that PMV succeeds in processing up to 16x larger graphs than existing distributed memory-based graph mining methods, and requires 9x less time than previous disk-based graph mining methods by reducing I/O costs significantly.
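
To make the idea of generalized matrix-vector multiplication concrete, the following is a minimal single-machine sketch in Python. It is not PMV's implementation and does not show pre-partitioning or distribution. It only illustrates how replacing the multiply, sum, and write-back steps of an ordinary matrix-vector product with user-supplied operations yields a graph algorithm such as connected component computation. The names `generalized_matvec`, `combine2`, `combine_all`, `assign`, and `connected_components` are assumptions chosen for illustration, not identifiers taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# One non-zero matrix entry: (row i, column j, value m_ij).
Edge = Tuple[int, int, float]


def generalized_matvec(
    edges: Sequence[Edge],
    v: Sequence[float],
    combine2: Callable[[float, float], float],
    combine_all: Callable[[List[float]], float],
    assign: Callable[[float, float], float],
) -> List[float]:
    """Compute one generalized product of a sparse matrix and a vector.

    combine2 replaces the multiplication m_ij * v_j,
    combine_all replaces the sum over a row,
    assign decides how the old v_i and the combined value form the new v_i.
    """
    partials: Dict[int, List[float]] = {}
    for i, j, m in edges:
        partials.setdefault(i, []).append(combine2(m, v[j]))
    # Rows without entries receive an empty list of partial results.
    return [assign(v[i], combine_all(partials.get(i, []))) for i in range(len(v))]


def connected_components(edges: Sequence[Edge], n: int) -> List[float]:
    """Label each node with the smallest node id in its component.

    Assumes `edges` holds both directions of every undirected edge.
    """
    labels: List[float] = [float(i) for i in range(n)]
    while True:
        updated = generalized_matvec(
            edges,
            labels,
            combine2=lambda m, vj: vj,  # pass the neighbor's label along
            combine_all=lambda xs: min(xs, default=float("inf")),
            assign=min,  # keep the smaller of old and new label
        )
        if updated == labels:  # fixed point reached
            return labels
        labels = updated


# Ordinary matrix-vector multiplication is the special case below.
plain = generalized_matvec(
    [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)],
    [1.0, 1.0],
    combine2=lambda m, vj: m * vj,
    combine_all=sum,
    assign=lambda old, new: new,
)

# Two components: {0, 1, 2} and {3, 4}.
undirected = [(0, 1, 1.0), (1, 0, 1.0), (1, 2, 1.0), (2, 1, 1.0),
              (3, 4, 1.0), (4, 3, 1.0)]
components = connected_components(undirected, 5)
```

In this sketch only the three operations change between the plain product and the connected component computation; the loop over matrix entries stays the same, which is the property that lets a single multiplication engine serve several graph mining methods.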

research
03/03/2021

On Fast Computation of a Circulant Matrix-Vector Product

This paper deals with circulant matrices. It is shown that a circulant m...
research
04/27/2018

Rateless Codes for Near-Perfect Load Balancing in Distributed Matrix-Vector Multiplication

Large-scale machine learning and data mining applications require comput...
research
08/25/2023

SAGE: A Storage-Based Approach for Scalable and Efficient Sparse Generalized Matrix-Matrix Multiplication

Sparse generalized matrix-matrix multiplication (SpGEMM) is a fundamenta...
research
09/12/2023

Ensemble Mask Networks

Can an ℝ^n→ℝ^n feedforward network learn matrix-vector multiplication? T...
research
03/28/2022

AWAPart: Adaptive Workload-Aware Partitioning of Knowledge Graphs

Large-scale knowledge graphs are increasingly common in many domains. Th...
research
01/25/2019

Distributed Matrix-Vector Multiplication: A Convolutional Coding Approach

Distributed computing systems are well-known to suffer from the problem ...
research
12/25/2021

On computing HITS ExpertRank via lumping the hub matrix

Dangling nodes are the nodes with no out-links in the web graph. It s...