Privacy-preserving Distributed Machine Learning via Local Randomization and ADMM Perturbation
With the proliferation of training data, distributed machine learning (DML) is becoming increasingly capable of handling large-scale learning tasks. However, privacy must be given high priority in DML, since training data may contain sensitive information about users. Most existing privacy-aware schemes assume that users trust the server collecting their data, and can only provide the same privacy guarantee for the entire data sample. In this paper, we remove the trusted-server assumption and propose a privacy-preserving ADMM-based DML framework that provides heterogeneous privacy protection for users' data. The main new challenge is to limit the accumulation of privacy loss over ADMM iterations as much as possible. In the proposed framework, a local randomization approach, which we prove to be differentially private, gives users a self-controlled privacy guarantee for their most sensitive information. Further, the ADMM algorithm is perturbed through a combined noise-adding method, which simultaneously preserves privacy for users' less sensitive information and strengthens the protection of the most sensitive information. We also analyze the performance of the trained model in terms of its generalization error. Finally, we conduct extensive experiments on synthetic and real-world datasets to validate the theoretical results and evaluate the classification performance of the proposed framework.
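The abstract does not specify the local randomization mechanism, so the following is only an illustrative sketch of the general idea. It uses classic binary randomized response, a standard locally differentially private mechanism, and it assumes that the "most sensitive information" is a binary label. The function names `keep_probability` and `randomized_response` and the per-user parameter `epsilon` are hypothetical and are not taken from the paper.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true label under binary randomized response."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(label: int, epsilon: float, rng: random.Random) -> int:
    """Perturb a binary label on the user's device (assumed setting, not the paper's).

    The true label is kept with probability e^eps / (1 + e^eps) and flipped
    otherwise, so the ratio of output probabilities for the two possible
    inputs is at most e^eps, which is the epsilon-local-DP condition.
    """
    if label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    return label if rng.random() < keep_probability(epsilon) else 1 - label


if __name__ == "__main__":
    rng = random.Random(0)
    # Each user picks their own epsilon: smaller epsilon means stronger privacy.
    user_budgets = [0.5, 1.0, 2.0]
    true_labels = [1, 0, 1]
    reported = [randomized_response(y, eps, rng)
                for y, eps in zip(true_labels, user_budgets)]
    print(reported)
```

Because each user chooses `epsilon` before anything leaves the device, the guarantee does not depend on trusting the server, which is the property the abstract highlights. The paper's actual mechanism and its interaction with the ADMM noise may differ from this sketch.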