Scalable Differentially Private Generative Student Model via PATE

by Yunhui Long, et al.

The recent rapid development of machine learning is largely due to algorithmic breakthroughs, advances in computational resources, and especially access to large amounts of training data. However, although data sharing has great potential to improve machine learning models and enable new applications, there are increasing concerns about the privacy implications of data collection. In this work, we present G-PATE, a novel approach for training a differentially private data generator. The generator can be used to produce synthetic datasets with strong privacy guarantees while preserving high data utility. Our approach leverages generative adversarial nets (GANs) to generate data and protects data privacy based on the Private Aggregation of Teacher Ensembles (PATE) framework. It improves the use of the privacy budget by ensuring differential privacy only for the generator, which is the part of the model that actually needs to be published for private data generation. To achieve this, we connect a student generator with an ensemble of teacher discriminators. We also propose a private gradient aggregation mechanism to ensure differential privacy on all the information that flows from the teacher discriminators to the student generator. We empirically show that G-PATE significantly outperforms prior work on both image and non-image datasets.
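
The abstract does not spell out how gradients from the teacher discriminators are aggregated, so the following is only a minimal sketch of the general idea, in the PATE style of noisy voting. It assumes each teacher reports a gradient vector with respect to a generated sample, reduces each coordinate to a sign vote, and adds Gaussian noise to the vote counts before taking the majority. The function name `noisy_sign_aggregate`, the sign-vote discretization, and the parameter `sigma` are illustrative assumptions rather than the paper's actual mechanism, and no privacy accounting is performed here.

```python
import random


def noisy_sign_aggregate(teacher_grads, sigma, rng):
    """Aggregate per-teacher gradients by a noisy majority vote on each coordinate's sign.

    teacher_grads: list of equal-length gradient vectors, one per teacher (assumed input).
    sigma: standard deviation of the Gaussian noise added to each vote count (assumed).
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    dim = len(teacher_grads[0])
    num_teachers = len(teacher_grads)
    aggregated = []
    for j in range(dim):
        # Each teacher casts one vote for the sign of coordinate j.
        positive_votes = sum(1 for grad in teacher_grads if grad[j] >= 0)
        negative_votes = num_teachers - positive_votes
        # Perturb both counts so a single teacher's vote is masked by the noise.
        noisy_positive = positive_votes + rng.gauss(0.0, sigma)
        noisy_negative = negative_votes + rng.gauss(0.0, sigma)
        aggregated.append(1.0 if noisy_positive >= noisy_negative else -1.0)
    return aggregated


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical teachers, each reporting a 2-dimensional gradient.
    grads = [[0.5, -1.0], [0.2, -0.3], [-0.1, -0.7]]
    print(noisy_sign_aggregate(grads, sigma=1.0, rng=rng))
```

In this sketch only the noisy, discretized vote reaches the student generator, which mirrors the abstract's requirement that all information flowing from teachers to student be privatized. Turning the noise scale into a concrete privacy guarantee would need a separate accounting step, which is omitted.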


Model Conversion via Differentially Private Data-Free Distillation

While massive valuable deep models trained on large-scale data have been...

Generating Differentially Private Datasets Using GANs

In this paper, we present a technique for generating artificial datasets...

Scalable Private Learning with PATE

The rapid adoption of machine learning has increased concerns about the ...

Differentially Private Deep Learning with Smooth Sensitivity

Ensuring the privacy of sensitive data used to train modern machine lear...

SEDML: Securely and Efficiently Harnessing Distributed Knowledge in Machine Learning

Training high-performing deep learning models require a rich amount of d...

Obfuscation of Images via Differential Privacy: From Facial Images to General Images

Due to the pervasiveness of image capturing devices in every-day life, i...
