Rateless Codes for Low-Latency Distributed Inference in Mobile Edge Computing
We consider a mobile edge computing scenario in which users want to perform a linear inference operation Wx on local data x for some network-side matrix W. The inference is performed in a distributed fashion over multiple servers at the network edge. For this scenario, we propose a coding scheme that combines a rateless code, which provides resiliency against straggling servers and hence reduces the computation latency, with an irregular-repetition code, which provides spatial diversity and hence reduces the communication latency. We further derive a lower bound on the total latency, which comprises computation latency, communication latency, and decoding latency. The proposed scheme performs remarkably close to the bound and yields significantly lower latency than the scheme based on maximum distance separable codes recently proposed by Zhang and Simeone.
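To make the straggler-resiliency idea concrete, the following is a minimal sketch of rateless-style coded computation of Wx. It is not the scheme proposed in the paper. It assumes dense random binary combinations of the rows of W and an exact Gaussian-elimination decoder, where a practical rateless code such as an LT or Raptor code would use a sparse degree distribution and a peeling decoder. It also omits the irregular-repetition code and any communication model. All function names and parameters (`encode_row`, `coded_inference`, `straggle_prob`, and so on) are hypothetical.

```python
import random
from fractions import Fraction


def matvec(M, v):
    """Plain matrix-vector product, used as the uncoded reference."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def encode_row(W, rng):
    """Draw one coded row: the sum of a random subset of the rows of W."""
    m, n = len(W), len(W[0])
    g = [rng.randint(0, 1) for _ in range(m)]  # binary coefficient vector
    coded = [sum(g[i] * W[i][j] for i in range(m)) for j in range(n)]
    return g, coded


def solve(G, y):
    """Solve G z = y exactly by Gauss-Jordan elimination.

    Returns None while G does not yet have full column rank.
    """
    m = len(G[0])
    A = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(G, y)]
    r = 0
    for c in range(m):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            return None  # rank deficient: wait for more coded results
        A[r], A[pivot] = A[pivot], A[r]
        pv = A[r][c]
        A[r] = [v / pv for v in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return [A[i][m] for i in range(m)]


def coded_inference(W, x, straggle_prob=0.3, seed=0, max_symbols=1000):
    """Issue coded tasks until enough non-straggling results allow decoding.

    Returns the decoded W x and the number of coded tasks issued.
    """
    rng = random.Random(seed)
    G, y = [], []
    for sent in range(1, max_symbols + 1):
        g, coded = encode_row(W, rng)
        if rng.random() < straggle_prob:
            continue  # assumed straggler model: this result never arrives
        G.append(g)
        y.append(sum(a * b for a, b in zip(coded, x)))  # server-side inner product
        if len(G) >= len(W):
            z = solve(G, y)
            if z is not None:
                return z, sent
    raise RuntimeError("not decodable within max_symbols")


W = [[1, 2, 0], [0, 1, 3], [4, 0, 1], [2, 2, 2]]
x = [1, -1, 2]
decoded, tasks_issued = coded_inference(W, x)
print(decoded == matvec(W, x), tasks_issued)
```

Because coded rows are generated on demand, the decoder does not depend on any particular server finishing: it only needs enough linearly independent results, whichever servers supply them. This is the property that lets a rateless code absorb stragglers without fixing a code rate in advance.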