clusterNOR: A NUMA-Optimized Clustering Framework

02/24/2019
by Dia Mhembere, et al.

Clustering algorithms are iterative and have complex data access patterns that result in many small random memory accesses. The performance of parallel implementations suffers from synchronous barriers at each iteration and from skewed workloads. We rethink the parallelization of clustering for modern non-uniform memory architectures (NUMA) to maximize independent, asynchronous computation. We eliminate many barriers, reduce remote memory accesses, and maximize cache reuse. We implement the 'Clustering NUMA Optimized Routines' (clusterNOR) extensible parallel framework, which provides algorithmic building blocks. The system is generic: we demonstrate nine modern clustering algorithms that have simple implementations. clusterNOR includes (i) in-memory, (ii) semi-external memory, and (iii) distributed memory execution, enabling computation for varying memory and hardware budgets. For algorithms that rely on Euclidean distance, clusterNOR defines an updated version of Elkan's triangle inequality pruning algorithm that uses asymptotically less memory, so that it works on billion-point data sets. clusterNOR extends the 'knor' library for k-means clustering by generalizing its underlying principles, providing a uniform programming interface, and expanding its scope to hierarchical and linear algebraic classes of algorithms. The compound effect of our optimizations is an order of magnitude improvement in speed over other state-of-the-art solutions, such as Spark's MLlib and Apple's Turi.
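
The triangle inequality pruning mentioned above rests on a simple lemma: if the distance between two centroids c_a and c_b is at least twice the distance from a point x to c_a, then c_b cannot be closer to x than c_a, so the distance from x to c_b need not be computed. The sketch below illustrates only that lemma in a single k-means assignment step. It is not clusterNOR's implementation and not Elkan's full algorithm, which also maintains per-point bounds across iterations; the function names (`assign_with_pruning`, `assign_brute_force`) and the toy data are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_brute_force(points, centroids):
    """Reference assignment: compute every point-to-centroid distance."""
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]


def assign_with_pruning(points, centroids):
    """Assign each point to its nearest centroid, skipping distance
    computations that the triangle inequality proves unnecessary.

    Returns (assignments, skipped), where skipped counts the avoided
    point-to-centroid distance computations.
    """
    k = len(centroids)
    # Centroid-to-centroid distances: k*k values, independent of the
    # number of points.
    cc = [[euclidean(centroids[i], centroids[j]) for j in range(k)]
          for i in range(k)]

    assignments = []
    skipped = 0
    for p in points:
        best = 0
        best_dist = euclidean(p, centroids[0])
        for j in range(1, k):
            # Lemma: d(c_best, c_j) >= 2 * d(p, c_best) implies
            # d(p, c_j) >= d(p, c_best), so c_j cannot win.
            if cc[best][j] >= 2.0 * best_dist:
                skipped += 1
                continue
            d = euclidean(p, centroids[j])
            if d < best_dist:
                best, best_dist = j, d
        assignments.append(best)
    return assignments, skipped


if __name__ == "__main__":
    # Hypothetical toy data: three well-separated groups.
    pts = [(0.0, 0.0), (0.1, 0.2), (10.0, 10.0), (10.2, 9.9), (-10.0, 10.0)]
    ctrs = [(0.0, 0.0), (10.0, 10.0), (-10.0, 10.0)]
    labels, skipped = assign_with_pruning(pts, ctrs)
    print("assignments:", labels)
    print("distance computations skipped:", skipped)
```

In this simplified form the only extra state is the k-by-k table of centroid distances, which does not grow with the number of points. How clusterNOR itself reduces the memory of Elkan's bounds is described in the full paper, not here.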


