Impala: Low-Latency, Communication-Efficient Private Deep Learning Inference
This paper proposes Impala, a new cryptographic protocol for private inference in the client-cloud setting. Impala builds upon recent solutions that combine the complementary strengths of homomorphic encryption (HE) and secure multi-party computation (MPC). We develop a series of protocol optimizations to reduce both communication and performance bottlenecks. First, we remove MPC's overwhelmingly high communication cost from the client by introducing a proxy server and developing a low-overhead key switching technique. Key switching reduces the client's bandwidth by multiple orders of magnitude; however, the communication between the proxy and cloud is still excessive. Second, we develop an optimized garbled circuit that leverages truncated secret shares for faster evaluation and less proxy-cloud communication. Finally, we propose sparse HE convolution to reduce the computational bottleneck of using HE. Compared to the state-of-the-art, these optimizations provide a bandwidth savings of over 3X and a speedup of 4X for private deep learning inference.
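To make the idea of truncated secret shares concrete, the following is a minimal sketch of two-party additive secret sharing with local truncation. It is not Impala's protocol: the ring width `K`, the helper names, and the truncation rule (a well-known local truncation trick from the secret-sharing literature) are assumptions chosen for illustration only. The sketch shows that each party can drop low-order bits from its own share and the reconstructed value is still correct to within one unit, except with negligible probability when the secret is small relative to the ring.

```python
import random

K = 64            # assumed ring bit width; not taken from the paper
MOD = 1 << K


def share(x, rng):
    """Split x into two additive shares with x = (s0 + s1) mod 2^K."""
    s0 = rng.randrange(MOD)
    s1 = (x - s0) % MOD
    return s0, s1


def reconstruct(s0, s1):
    """Recombine two additive shares into the ring element they encode."""
    return (s0 + s1) % MOD


def truncate_shares(s0, s1, d):
    """Each party locally drops d low-order bits from its own share.

    Party 0 shifts its share directly; party 1 shifts the negation of its
    share and negates back. No communication is needed.
    """
    t0 = s0 >> d
    t1 = (MOD - ((MOD - s1) >> d)) % MOD
    return t0, t1


def to_signed(v):
    """Interpret a ring element as a two's-complement signed integer."""
    return v - MOD if v >= MOD // 2 else v


def truncated_value(x, d, rng):
    """Share x, truncate both shares by d bits, and reconstruct (signed)."""
    s0, s1 = share(x % MOD, rng)
    t0, t1 = truncate_shares(s0, s1, d)
    return to_signed(reconstruct(t0, t1))
```

For example, `truncated_value(123456, 8, random.Random(0))` yields a value within one of 123456 / 256. Fewer bits per share means fewer wires entering a garbled circuit, which is the intuition behind using truncation to cut proxy-cloud communication; the exact construction used by Impala is described in the full paper.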