Rack-Aware Regenerating Codes with Multiple Erasure Tolerance

06/07/2021
by   Liyang Zhou, et al.

In a modern distributed storage system, storage nodes are organized in racks, and cross-rack communication dominates the system bandwidth. In this paper, we focus on the rack-aware storage system. The original model repairs each single-node failure immediately. However, multiple node failures are frequent, and some systems even wait for several node failures to accumulate before repairing them in order to keep costs down. So that repair remains possible when multiple failures occur, we relax the repair model of the rack-aware storage system. In the repair process, both the number of cross-rack connections (i.e., the number of helper racks contacted for repair, called the repair degree) and the number of intra-rack connections (i.e., the number of helper nodes in the rack that contains the failed node) are reduced. We focus on minimizing the cross-rack bandwidth in the rack-aware storage system with multiple erasure tolerance. First, the fundamental tradeoff between the repair bandwidth and the storage size for functional repair is established, and the two extreme points corresponding to minimum storage and minimum cross-rack repair bandwidth are obtained. Second, explicit constructions corresponding to the two points are given. Both have the minimum sub-packetization level (i.e., the number of symbols stored in each node) and a small repair degree. In addition, the size of the underlying finite field is approximately the block length of the code. Finally, for convenience in practical use, we also establish a transformation that converts our codes into systematic codes.
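The two connection counts defined in the abstract can be made concrete with a small bookkeeping sketch. It is not the paper's construction and performs no coding; it only tallies which nodes and racks a repair would touch. The layout (n nodes split into racks of u consecutive nodes), the function name `repair_connections`, and all parameter names are assumptions made for illustration.

```python
def repair_connections(n, u, failed, helper_racks):
    """Tally the connections used to repair `failed` nodes.

    Assumed layout (hypothetical, for illustration only): nodes 0..n-1
    are placed in racks of u consecutive nodes, so node x sits in rack x // u.
    """
    if n % u != 0:
        raise ValueError("n must be a multiple of u")
    failed = set(failed)
    if any(f < 0 or f >= n for f in failed):
        raise ValueError("failed node index out of range")

    # Racks that contain at least one failed node.
    host_racks = sorted({f // u for f in failed})

    # Intra-rack helpers: surviving nodes inside each host rack.
    intra = {
        r: [x for x in range(r * u, (r + 1) * u) if x not in failed]
        for r in host_racks
    }

    # Cross-rack helpers: contacted racks that hold no failed node.
    cross = [r for r in helper_racks if r not in host_racks]

    return {
        "host_racks": host_racks,
        "intra_rack_helpers": intra,
        "repair_degree": len(cross),  # number of helper racks
    }


# Example: 12 nodes, 4 per rack, nodes 4 and 6 fail, racks 0 and 2 help.
result = repair_connections(n=12, u=4, failed=[4, 6], helper_racks=[0, 2])
print(result)
```

In this example both failures fall in rack 1, so the surviving nodes 5 and 7 are the available intra-rack helpers and the repair degree is 2. The relaxed model described above aims to keep both counts small.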


research
03/02/2021

Explicit Construction of Minimum Bandwidth Rack-Aware Regenerating Codes

In large data centers, storage nodes are organized in racks, and the cro...
research
12/10/2021

Optimal Quaternary (r,delta)-Locally Repairable Codes Achieving the Singleton-type Bound

Locally repairable codes enables fast repair of node failure in a distri...
research
01/21/2021

Rack-Aware Regenerating Codes with Fewer Helper Racks

We consider the rack-aware storage system where n nodes are organized in...
research
10/16/2017

LDPC Code Design for Distributed Storage: Balancing Repair Bandwidth, Reliability and Storage Overhead

Distributed storage systems suffer from significant repair traffic gener...
research
04/08/2020

Deterministic Data Distribution for Efficient Recovery in Erasure-Coded Storage Systems

Due to individual unreliable commodity components, failures are common i...
research
11/02/2022

Node repair on connected graphs, Part II

We continue our study of regenerating codes in distributed storage syste...
research
06/16/2020

Multilinear Algebra for Distributed Storage

An (n, k, d, α, β, M)-ERRC (exact-repair regenerating code) is a collect...
