SAGE: A Storage-Based Approach for Scalable and Efficient Sparse Generalized Matrix-Matrix Multiplication

08/25/2023
by   Myung-Hwan Jang, et al.

Sparse generalized matrix-matrix multiplication (SpGEMM) is a fundamental operation for real-world network analysis. As real-world networks grow, single-machine SpGEMM approaches cannot handle networks that exceed the size of main memory (i.e., they are not scalable). Distributed-system-based approaches can handle large-scale SpGEMM across multiple machines, but they suffer from severe inter-machine communication overhead when aggregating the results of those machines (i.e., they are not efficient). To address this dilemma, we propose SAGE, a novel storage-based SpGEMM approach that keeps the given networks in storage (e.g., SSD) and, via a 3-layer architecture, loads only the necessary parts into main memory when they are required for processing. Furthermore, we identify three challenges that could degrade the overall performance of SAGE and propose three strategies to address them: (1) block-based workload allocation to balance workloads across threads, (2) in-memory partial aggregation to reduce unnecessary storage-memory I/Os, and (3) distribution-aware memory allocation to prevent unexpected buffer overflows in main memory. Through extensive evaluation, we verify the superiority of SAGE over existing SpGEMM methods in terms of scalability and efficiency.
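
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind processing a sparse product block by block and aggregating partial products in memory before emitting a result row. It is not SAGE's implementation. The function name `spgemm_row_blocks`, the dict-of-dicts matrix layout, and the `block_size` parameter are assumptions made for illustration; a slice of row indices stands in for a block that a storage-based system would load from SSD.

```python
from collections import defaultdict


def spgemm_row_blocks(a_rows, b_rows, block_size=2):
    """Compute C = A * B for sparse matrices, visiting A in blocks of rows.

    a_rows, b_rows: dict mapping row index -> {column index: value}.
    Returns C in the same dict-of-dicts layout (hypothetical format).
    """
    c_rows = {}
    row_ids = sorted(a_rows)
    for start in range(0, len(row_ids), block_size):
        # Stand-in for loading one block of A from storage into memory.
        block = row_ids[start:start + block_size]
        for i in block:
            # In-memory accumulator: partial products for output row i
            # are summed here instead of being emitted one by one.
            acc = defaultdict(float)
            for k, a_ik in a_rows[i].items():
                for j, b_kj in b_rows.get(k, {}).items():
                    acc[j] += a_ik * b_kj
            if acc:
                # A finished row is emitted once, after aggregation.
                c_rows[i] = dict(acc)
    return c_rows


A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}, 2: {0: 4.0, 2: 5.0}}
B = {0: {1: 1.0}, 1: {0: 2.0, 2: 1.0}, 2: {1: 3.0}}
C = spgemm_row_blocks(A, B)
```

In this toy input, rows 0 and 2 of the product each receive two partial products for the same column, which the accumulator merges before the row is emitted. That merging is the effect the in-memory aggregation strategy aims for, although how SAGE realizes it across threads and storage is not described in the abstract.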

research
05/27/2021

Efficient distributed algorithms for Convolutional Neural Networks

Several efficient distributed algorithms have been developed for matrix-...
research
02/06/2020

Product Kanerva Machines: Factorized Bayesian Memory

An ideal cognitively-inspired memory system would compress and organize ...
research
10/16/2020

Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale

Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in ...
research
05/05/2019

MapReduce Meets Fine-Grained Complexity: MapReduce Algorithms for APSP, Matrix Multiplication, 3-SUM, and Beyond

Distributed processing frameworks, such as MapReduce, Hadoop, and Spark ...
research
09/26/2017

PMV: Pre-partitioned Generalized Matrix-Vector Multiplication for Scalable Graph Mining

How can we analyze enormous networks including the Web and social networ...
research
03/13/2019

GNA: new framework for statistical data analysis

We report on the status of GNA — a new framework for fitting large-scale...
research
02/20/2020

SpArch: Efficient Architecture for Sparse Matrix Multiplication

Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous...
