CannyFS: Opportunistically Maximizing I/O Throughput Exploiting the Transactional Nature of Batch-Mode Data Processing

12/20/2016
by   Jessica Nettelblad, et al.

We introduce a user-mode file system, CannyFS, that hides latency by assuming all I/O operations will succeed. The user-mode process will in turn report errors, allowing proper cleanup and a repeated attempt to take place. We demonstrate benefits for the model tasks of extracting archives and removing directory trees in a real-life HPC environment, giving typical reductions in time use of over 80%. This approach can be considered a view of HPC jobs and their I/O activity as transactions. In general, file systems lack clearly defined transaction semantics. Over time, the competing trends to add cache and maintain data integrity have resulted in different practical tradeoffs. High-performance computing is a special case where overall throughput demands are high. Latency can also be high, with non-local storage. In addition, a theoretically possible I/O error (such as permission denied, loss of connection, or exceeding disk quota) will frequently warrant the resubmission of a full job or task, rather than traditional error reporting or handling. Therefore, opportunistically treating each I/O operation as successful, and as part of a larger transaction, can speed up some applications that do not leverage asynchronous I/O.
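The idea of reporting success immediately and surfacing errors only when the whole job finishes can be illustrated with a small Python sketch. This is not CannyFS code and makes no claim about how CannyFS is implemented; the class `OptimisticIO`, its methods `submit` and `commit`, and the helper `write_file` are hypothetical names chosen for illustration. The sketch assumes a single background worker, so operations run in submission order.

```python
import os
import queue
import tempfile
import threading


class OptimisticIO:
    """Hypothetical helper: queue file operations, claim success at once,
    and collect any errors for inspection when the batch is committed."""

    def __init__(self):
        self._ops = queue.Queue()
        self.errors = []  # (description, exception) pairs for failed operations
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # A single worker keeps operations in submission order.
        while True:
            item = self._ops.get()
            if item is None:  # sentinel placed by commit()
                break
            desc, func, args = item
            try:
                func(*args)
            except OSError as exc:
                # The caller was already told "success"; record the failure.
                self.errors.append((desc, exc))

    def submit(self, desc, func, *args):
        """Enqueue an operation and optimistically report success."""
        self._ops.put((desc, func, args))
        return True

    def commit(self):
        """Wait for all queued operations; True only if none failed."""
        self._ops.put(None)
        self._worker.join()
        return not self.errors


def write_file(path, data):
    with open(path, "wb") as fh:
        fh.write(data)


def demo():
    root = tempfile.mkdtemp()
    io = OptimisticIO()
    io.submit("mkdir a", os.mkdir, os.path.join(root, "a"))
    io.submit("write a/x", write_file, os.path.join(root, "a", "x"), b"hello")
    # This write targets a directory that does not exist, so it fails later.
    io.submit("write missing/y", write_file,
              os.path.join(root, "missing", "y"), b"oops")
    ok = io.commit()
    return root, ok, io.errors


if __name__ == "__main__":
    root, ok, errors = demo()
    if not ok:
        for desc, exc in errors:
            print("failed:", desc, "->", exc)
```

In this sketch every `submit` call returns immediately, and the failure of the third operation becomes visible only at `commit`. In the transactional view described above, a job script would respond by cleaning up and resubmitting the whole task instead of handling the individual error.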

