The repair problem for Reed-Solomon codes: Optimal repair of single and multiple erasures, asymptotically optimal node size

05/04/2018
by Itzhak Tamo, et al.

The repair problem in distributed storage addresses recovery of data encoded with an erasure code, for instance a Reed-Solomon (RS) code. We consider the problem of repairing a single node or multiple nodes in RS-coded storage systems using the smallest possible amount of inter-nodal communication. According to the cut-set bound, the communication cost of repairing h ≥ 1 failed nodes for an (n, k=n-r) MDS code using d helper nodes is at least dhl/(d+h-k), where l is the size of the node. Guruswami and Wootters (2016) initiated the study of efficient repair of RS codes, showing that they can be repaired using a smaller bandwidth than under the trivial approach. At the same time, their work as well as follow-up papers stopped short of constructing RS codes (or any scalar MDS codes) that meet the cut-set bound with equality. In this paper we construct families of RS codes that achieve the cut-set bound for repair of one or several nodes. In the single-node case, we present RS codes of length n over the field F_{q^l}, l = exp((1+o(1)) n log n), that meet the cut-set bound. We also prove an almost matching lower bound on l, showing that super-exponential scaling is both necessary and sufficient for scalar MDS codes to achieve the cut-set bound using linear repair schemes. For the case of multiple nodes, we construct a family of RS codes that achieve the cut-set bound universally for the repair of any h = 2, 3, ... failed nodes from any subset of d helper nodes, k ≤ d ≤ n-h. For a fixed number of parities r, the node size of the constructed codes is close to the smallest possible node size for codes with such properties.
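
To make the cut-set bound concrete, the following minimal Python sketch evaluates dhl/(d+h-k) and compares it with the trivial approach, taken here to mean downloading the full contents of k nodes (k·l symbols). The function names and the sample parameters are illustrative assumptions, not taken from the paper, and the admissible range k ≤ d ≤ n-h is assumed for the input check.

```python
from fractions import Fraction


def cut_set_bound(n: int, k: int, d: int, h: int, l: int) -> Fraction:
    """Lower bound d*h*l/(d+h-k) on the repair bandwidth, in field symbols.

    n, k: length and dimension of the MDS code
    d:    number of helper nodes contacted
    h:    number of failed nodes
    l:    node size (symbols stored per node)
    The range check k <= d <= n - h is an assumption of this sketch.
    """
    if h < 1 or not (k <= d <= n - h):
        raise ValueError("need h >= 1 and k <= d <= n - h")
    return Fraction(d * h * l, d + h - k)


def trivial_repair_cost(k: int, l: int) -> int:
    """Bandwidth of the trivial approach: download k full nodes."""
    return k * l


# Hypothetical parameters: a (14, 10) code with node size l = 4.
single = cut_set_bound(n=14, k=10, d=13, h=1, l=4)
naive = trivial_repair_cost(k=10, l=4)
print(single, naive)
```

With these sample values the bound is 13 symbols for a single failure when all 13 surviving nodes help, against 40 symbols for the trivial approach. Setting d = k makes the bound equal to k·l, so contacting more than k helpers is what creates room for savings.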
