Coded Matrix Multiplication on a Group-Based Model
Coded distributed computing is regarded as a promising technique for making large-scale systems robust to "straggler" workers. Yet practical system models that reflect the clustered or grouped structure of real-world computing servers have not been available, nor have the large variations in computing power and bandwidth across servers been properly modeled. We propose a group-based model that reflects these practical conditions and develop a coding scheme suited to it. The proposed code, called the group code, employs parallel encoding for each group. We show that the scheme asymptotically achieves the optimal computing time as n, the number of workers, tends to infinity. Although the theoretical analysis is conducted in the asymptotic regime, numerical results show that the scheme also achieves near-optimal computing time for finite but reasonably large n. Moreover, we demonstrate that the decoding complexity of the scheme is significantly reduced by virtue of parallel decoding.
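To make the idea of per-group encoding and decoding concrete, here is a minimal sketch of a coded matrix-vector product. It is not the authors' construction: the abstract does not specify the code, so the sketch assumes each group independently applies a real-valued Vandermonde code, so that any k of its n workers suffice to recover that group's share of the result. All names (`encode_group`, `decode_group`, `group_coded_matvec`) and the `(k, n)` group parameters are hypothetical.

```python
import numpy as np

def encode_group(blocks, n_workers):
    """Encode k row-blocks into n_workers coded blocks (assumed Vandermonde code)."""
    k = len(blocks)
    # Distinct positive evaluation points keep every k x k row-submatrix invertible.
    points = np.arange(1, n_workers + 1, dtype=float)
    G = np.vander(points, k, increasing=True)  # shape (n_workers, k)
    coded = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n_workers)]
    return G, coded

def decode_group(G, results, k):
    """Recover the k uncoded products from any k finished workers of one group.

    results maps a worker index to that worker's coded product.
    """
    idx = sorted(results)[:k]
    Y = np.stack([results[i] for i in idx])  # shape (k, rows_per_block)
    return np.linalg.solve(G[idx], Y)        # one small k x k solve per group

def group_coded_matvec(A, x, groups, stragglers):
    """Compute A @ x with one independent code per group.

    groups: list of (k, n) pairs, one per group (hypothetical parameters).
    stragglers: set of (group, worker) pairs that never respond.
    The row count of A must be divisible by the sum of all k.
    """
    total_k = sum(k for k, _ in groups)
    blocks = np.split(A, total_k)
    out, start = [], 0
    for g, (k, n) in enumerate(groups):
        G, coded = encode_group(blocks[start:start + k], n)
        # Each non-straggling worker multiplies its coded block by x.
        results = {i: coded[i] @ x for i in range(n) if (g, i) not in stragglers}
        out.append(decode_group(G, results, k).reshape(-1))
        start += k
    return np.concatenate(out)

rng = np.random.default_rng(0)
A = rng.standard_normal((12, 5))
x = rng.standard_normal(5)
groups = [(2, 3), (4, 6)]                  # a small group and a larger group
stragglers = {(0, 1), (1, 0), (1, 4)}      # workers that never return
y = group_coded_matvec(A, x, groups, stragglers)
```

In this sketch each group is encoded and decoded on its own, so decoding solves several small linear systems instead of one system spanning all workers. Under the stated assumptions, that separation is what lets the groups be decoded in parallel.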