Checkpointing and Localized Recovery for Nested Fork-Join Programs

02/25/2021
by Claudia Fohry, et al.

While checkpointing is typically combined with a restart of the whole application, localized recovery permits all but the affected processes to continue. In task-based cluster programming, for instance, the application can then finish on the intact nodes, with the lost tasks reassigned to them. This extended abstract proposes adapting a checkpointing and localized recovery technique, originally developed for independent tasks, to nested fork-join programs. We consider a Cilk-like work stealing scheme with a work-first policy in a distributed-memory setting and describe the required algorithmic changes. The original technique has checkpointing overheads below 1%, and the adapted technique is expected to achieve similar performance.
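To make the idea of localized recovery concrete, the following toy simulation may help. It is a minimal sketch under strong assumptions and is not the algorithm from the paper: tasks are independent (the simpler setting of the original technique), there is no work stealing, at most one worker fails, and checkpoints are held in a plain dictionary that stands in for resilient storage. All names (`Worker`, `run`, `fail_at`, `checkpoint_every`) are hypothetical.

```python
from collections import deque


class Worker:
    """A simulated worker holding a queue of independent tasks."""

    def __init__(self, wid, tasks):
        self.wid = wid
        self.pending = deque(tasks)
        self.partial = 0      # partial result accumulated so far
        self.alive = True

    def step(self):
        # Process one task; the "work" here is just squaring a number.
        if self.pending:
            n = self.pending.popleft()
            self.partial += n * n

    def snapshot(self):
        # A checkpoint is the pending tasks plus the partial result.
        return (list(self.pending), self.partial)


def run(num_workers=3, tasks_per_worker=6, checkpoint_every=2, fail_at=(3, 1)):
    """Simulate the run; fail_at=(step, worker_id) injects a single failure."""
    workers = [
        Worker(w, range(w * tasks_per_worker, (w + 1) * tasks_per_worker))
        for w in range(num_workers)
    ]
    # Assumption: this dict survives worker failures (resilient store).
    checkpoints = {w.wid: w.snapshot() for w in workers}
    step = 0
    while any(w.alive and w.pending for w in workers):
        step += 1
        for w in workers:
            if w.alive:
                w.step()
        if fail_at is not None and step == fail_at[0]:
            victim = workers[fail_at[1]]
            victim.alive = False
            # Localized recovery: only the victim's state is rolled back to
            # its last checkpoint; all other workers keep their progress.
            pending, partial = checkpoints.pop(victim.wid)
            heir = next(w for w in workers if w.alive)
            heir.pending.extend(pending)
            heir.partial += partial
            # Re-checkpoint the heir so the adopted work is protected too.
            checkpoints[heir.wid] = heir.snapshot()
        if step % checkpoint_every == 0:
            for w in workers:
                if w.alive:
                    checkpoints[w.wid] = w.snapshot()
    return sum(w.partial for w in workers if w.alive)


if __name__ == "__main__":
    print(run() == run(fail_at=None))
```

In this sketch the work the failed worker did after its last checkpoint is simply redone by the surviving worker, so the final result matches a failure-free run. Nested fork-join programs with work stealing add dependencies between tasks and task migration between workers, which this toy model deliberately leaves out.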


