LDPC Code Design for Distributed Storage: Balancing Repair Bandwidth, Reliability and Storage Overhead

by Hyegyeong Park, et al.

Distributed storage systems suffer from significant repair traffic generated due to frequent storage node failures. This paper shows that properly designed low-density parity-check (LDPC) codes can substantially reduce the amount of required block downloads for repair thanks to the sparse nature of their factor graph representation. In particular, with a careful construction of the factor graph, both low repair-bandwidth and high reliability can be achieved for a given code rate. First, a formula for the average repair bandwidth of LDPC codes is developed. This formula is then used to establish that the minimum repair bandwidth can be achieved by forcing a regular check node degree in the factor graph. Moreover, it is shown that given a fixed code rate, the variable node degree should also be regular to yield minimum repair bandwidth, under some reasonable minimum variable node degree constraint. It is also shown that for a given repair-bandwidth requirement, LDPC codes can yield substantially higher reliability than currently utilized Reed-Solomon (RS) codes. Our reliability analysis is based on a formulation of the general equation for the mean-time-to-data-loss (MTTDL) associated with LDPC codes. The formulation reveals that the stopping number is closely related to the MTTDL. It is further shown that LDPC codes can be designed such that a small loss of repair-bandwidth optimality may be traded for a large improvement in erasure-correction capability and thus the MTTDL.
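To make the link between check node degree and repair traffic concrete, here is a minimal sketch under a simplified repair model. It is an assumption for illustration, not the formula derived in the paper: a single failed block (variable node) is repaired through one of its adjacent check nodes, chosen uniformly at random, and a check of degree d requires d - 1 block downloads. The function name `average_repair_bandwidth` and the two small parity-check matrices are hypothetical.

```python
def average_repair_bandwidth(H):
    """Average number of block downloads needed to repair one failed block.

    H is a parity-check matrix given as a list of 0/1 rows (checks x variables).
    Assumed model (illustrative only): the failed variable node is repaired via
    one adjacent check picked uniformly at random, and a check of degree d
    requires downloading the other d - 1 blocks it touches.
    """
    num_vars = len(H[0])
    check_degrees = [sum(row) for row in H]
    per_node_costs = []
    for v in range(num_vars):
        # Download cost of each check that involves variable node v.
        costs = [check_degrees[c] - 1 for c, row in enumerate(H) if row[v]]
        if not costs:
            raise ValueError(f"variable node {v} is not covered by any check")
        per_node_costs.append(sum(costs) / len(costs))
    return sum(per_node_costs) / num_vars


# Two hypothetical factor graphs with 6 variable nodes, 3 checks, 12 edges each.
# Regular check degree: every check has degree 4.
H_REGULAR = [
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
]
# Irregular check degrees: 2, 4 and 6.
H_IRREGULAR = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]

if __name__ == "__main__":
    print(average_repair_bandwidth(H_REGULAR))
    print(average_repair_bandwidth(H_IRREGULAR))
```

Under this assumed model, the regular graph needs 3 downloads per repair on average, while the irregular graph with the same number of edges needs 22/6 (about 3.67). This agrees with the abstract's statement that a regular check node degree minimizes repair bandwidth, although a two-matrix comparison is only an illustration and not a proof.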




