Scalable Differentially Private Generative Student Model via PATE

06/21/2019
by Yunhui Long, et al.

The recent rapid progress of machine learning is largely due to algorithmic breakthroughs, growing computational resources, and, in particular, access to large amounts of training data. However, although data sharing has great potential to improve machine learning models and enable new applications, there are increasing concerns about the privacy implications of data collection. In this work, we present G-PATE, a novel approach for training a differentially private data generator. The generator can be used to produce synthetic datasets with strong privacy guarantees while preserving high data utility. Our approach leverages generative adversarial networks (GANs) to generate data and protects data privacy using the Private Aggregation of Teacher Ensembles (PATE) framework. It improves the use of the privacy budget by enforcing differential privacy only on the generator, which is the part of the model that actually needs to be published for private data generation. To achieve this, we connect a student generator with an ensemble of teacher discriminators and propose a private gradient aggregation mechanism that ensures differential privacy on all information flowing from the teacher discriminators to the student generator. We empirically show that G-PATE significantly outperforms prior work on both image and non-image datasets.
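To make the teacher-to-student data flow concrete, the sketch below illustrates the idea in PyTorch. It is not the authors' implementation: the network sizes, the 784-dimensional flattened inputs, and the simple sign-based gradient voting with Laplace noise (standing in for the paper's private gradient aggregation mechanism) are illustrative assumptions, and privacy accounting is omitted.

# Minimal, illustrative sketch of the G-PATE idea (not the authors' code).
# Assumptions: PyTorch, 784-dimensional flattened data, and a simplified
# sign-based vote over per-dimension gradient directions with Laplace noise.
import torch
import torch.nn as nn

N_TEACHERS, LATENT, DATA_DIM, SIGMA = 10, 64, 784, 1.0

def make_discriminator():
    return nn.Sequential(nn.Linear(DATA_DIM, 128), nn.ReLU(), nn.Linear(128, 1))

# One teacher discriminator per disjoint partition of the sensitive data
# (assumed pre-trained on its partition; teacher training is omitted here).
teachers = [make_discriminator() for _ in range(N_TEACHERS)]

# Student generator: the only component that must satisfy differential
# privacy, because it is the part released for synthetic data generation.
generator = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(),
                          nn.Linear(128, DATA_DIM), nn.Tanh())
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)

def noisy_aggregated_gradient(fake_batch):
    """Aggregate teacher gradients on the fake samples with added noise.

    Each teacher votes with the sign of its gradient w.r.t. the fake data;
    Laplace noise on the vote counts stands in for the paper's private
    gradient aggregation mechanism.
    """
    votes = torch.zeros_like(fake_batch)
    for t in teachers:
        x = fake_batch.detach().requires_grad_(True)
        loss = -t(x).mean()            # teacher wants fakes to look real
        grad, = torch.autograd.grad(loss, x)
        votes += grad.sign()
    votes += torch.distributions.Laplace(0.0, SIGMA).sample(votes.shape)
    return votes.sign()                # noisy consensus direction

# One student-generator update: only the noisy, aggregated signal flows
# from the teachers to the student, so the teachers are never released.
z = torch.randn(32, LATENT)
fake = generator(z)
direction = noisy_aggregated_gradient(fake)
g_opt.zero_grad()
fake.backward(direction)               # step the generator along the private signal
g_opt.step()

The point of this structure is that only the noisy, aggregated gradient ever crosses from the teacher discriminators to the student generator, so the privacy budget is spent solely on the generator, the one component that is published.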

Related research

- Model Conversion via Differentially Private Data-Free Distillation (04/25/2023): While massive valuable deep models trained on large-scale data have been...
- Generating Differentially Private Datasets Using GANs (03/08/2018): In this paper, we present a technique for generating artificial datasets...
- Scalable Private Learning with PATE (02/24/2018): The rapid adoption of machine learning has increased concerns about the...
- PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification (04/02/2021): We propose using an adversarial autoencoder (AAE) to replace generative...
- Differentially Private Deep Learning with Smooth Sensitivity (03/01/2020): Ensuring the privacy of sensitive data used to train modern machine lear...
- SEDML: Securely and Efficiently Harnessing Distributed Knowledge in Machine Learning (10/26/2021): Training high-performing deep learning models require a rich amount of d...
- Obfuscation of Images via Differential Privacy: From Facial Images to General Images (02/19/2021): Due to the pervasiveness of image capturing devices in every-day life, i...
