Large Scale Parallelization Using File-Based Communications

09/03/2019
by Chansup Byun, et al.

In this paper, we present a novel file-based communication architecture that uses the local filesystem for large scale parallelization. This approach eliminates the filesystem overload and resource contention that arise when the central filesystem is used for large parallel jobs. It does incur additional overhead from inter-node message file transfers when the sending and receiving processes are not on the same node. Even with this overhead, however, the approach brings far greater benefits to overall cluster operation, and it also improves message communication performance for large scale parallel jobs. For example, in a 2048-process parallel job, MPI_Bcast() achieved about 34 times better performance when using the local filesystem. Furthermore, since security for transferring message files is handled entirely by the secure copy protocol (scp) and file system permissions, no additional security measures or ports are required beyond those typically required on an HPC system.
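
To make the idea of file-based messaging concrete, the following is a minimal sketch of a same-node send and receive through a local directory. It is not the authors' implementation: the function names, the file naming scheme, the use of pickle, and the polling loop are all assumptions made for illustration. The inter-node case, which the abstract says is handled by transferring the message file with scp, is only noted in a comment.

```python
import os
import pickle
import tempfile
import time


def message_path(msg_dir, src, dest, tag):
    """Hypothetical naming scheme: one file per (source, destination, tag)."""
    return os.path.join(msg_dir, f"msg_{src}_to_{dest}_tag_{tag}.pkl")


def send_msg(msg_dir, src, dest, tag, payload):
    """Write the payload to a temporary file, then rename it into place.

    os.replace is atomic on a POSIX local filesystem, so a receiver never
    observes a partially written message file.
    """
    fd, tmp_path = tempfile.mkstemp(dir=msg_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(payload, f)
    os.replace(tmp_path, message_path(msg_dir, src, dest, tag))
    # Assumption: if the receiver ran on another node, the finished file
    # would also have to be copied to that node's local directory (the
    # abstract mentions scp for this). That step is omitted here.


def recv_msg(msg_dir, src, dest, tag, timeout=5.0, poll=0.01):
    """Poll for the message file, load it, and delete it once consumed."""
    path = message_path(msg_dir, src, dest, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message at {path}")
        time.sleep(poll)
    with open(path, "rb") as f:
        payload = pickle.load(f)
    os.remove(path)
    return payload
```

In this sketch each process would point `msg_dir` at a directory on its node's local disk, so polling and file creation do not touch the central filesystem.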

research
08/25/2021

Node-Based Job Scheduling for Large Scale Simulations of Short Running Jobs

Diverse workloads such as interactive supercomputing, big data analysis,...
research
08/03/2018

A Stochastic Model for File Lifetime and Security in Data Center Networks

Data center networks are an important infrastructure in various applicat...
research
12/06/2022

DisTRaC: Accelerating High Performance Compute Processing for Temporary Data Storage

High Performance Compute (HPC) clusters often produce intermediate files...
research
08/05/2020

Best of Both Worlds: High Performance Interactive and Batch Launching

Rapid launch of thousands of jobs is essential for effective interactive...
research
02/08/2018

A local parallel communication algorithm for polydisperse rigid body dynamics

The simulation of large ensembles of particles is usually parallelized b...
research
07/25/2018

PaPaS: A Portable, Lightweight, and Generic Framework for Parallel Parameter Studies

The current landscape of scientific research is widely based on modeling...
research
08/17/2023

Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching

Gzip is a file compression format, which is ubiquitously used. Although ...
