RADICAL-Pilot and Parsl: Executing Heterogeneous Workflows on HPC Platforms

by Aymen Alsaadi, et al.

Executing scientific workflows with heterogeneous tasks on HPC platforms poses several challenges that upcoming exascale platforms will further exacerbate. At that scale, bespoke solutions will not enable effective and efficient workflow execution. In preparation, we need ways to manage engineering effort and capability duplication across software systems by integrating independently developed, production-grade software solutions. In this paper, we integrate RADICAL-Pilot (RP) and Parsl and develop an MPI executor to enable the execution of workflows with heterogeneous MPI and non-MPI Python functions at scale. We characterize the strong and weak scaling of the integrated RP-Parsl system when executing two use cases from polar science, and of the function executor on both SDSC Comet and TACC Frontera. We gain engineering insight into how to analyze and integrate workflow and runtime systems while minimizing changes to their code bases and overall development effort. Our experiments show that the overheads of the integrated system are invariant with respect to resource and workflow scale, and measure the impact of diverse MPI overheads. Together, these results define a blueprint for an ecosystem populated by specialized, efficient, effective, and independently maintained software systems to face the upcoming scaling challenges.


