Scalable Private Learning with PATE

02/24/2018
by Nicolas Papernot et al.

The rapid adoption of machine learning has increased concerns about the privacy implications of machine learning models trained on sensitive data, such as medical records or other personal information. To address those concerns, one promising approach is Private Aggregation of Teacher Ensembles, or PATE, which transfers to a "student" model the knowledge of an ensemble of "teacher" models, with intuitive privacy provided by training teachers on disjoint data and strong privacy guaranteed by noisy aggregation of teachers' answers. However, PATE has so far been evaluated only on simple classification tasks like MNIST, leaving unclear its utility when applied to larger-scale learning tasks and real-world datasets. In this work, we show how PATE can scale to learning tasks with large numbers of output classes and uncurated, imbalanced training data with errors. For this, we introduce new noisy aggregation mechanisms for teacher ensembles that are more selective and add less noise, and prove their tighter differential-privacy guarantees. Our new mechanisms build on two insights: the chance of teacher consensus is increased by using more concentrated noise and, lacking consensus, no answer need be given to a student. The consensus answers used are more likely to be correct, offer better intuitive privacy, and incur a lower differential-privacy cost. Our evaluation shows our mechanisms improve on the original PATE on all measures, and scale to larger tasks with both high utility and very strong privacy (ε < 1.0).
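The abstract describes the new aggregation mechanism only at a high level. As an illustration, here is a minimal NumPy sketch of a Confident-GNMax-style aggregator in the spirit described above: the teachers' votes are tallied, a noisy consensus check decides whether to answer at all, and any released answer is a noisy argmax over the vote histogram with concentrated (Gaussian) noise. The function name and the parameters `threshold`, `sigma1`, and `sigma2` are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def confident_gnmax_aggregate(votes, threshold, sigma1, sigma2, rng=None):
    """Aggregate one query's teacher votes, or abstain.

    votes     -- per-class vote counts from the teacher ensemble
    threshold -- consensus bar the noisy top count must clear
    sigma1    -- std. dev. of Gaussian noise in the consensus check
    sigma2    -- std. dev. of Gaussian noise in the released argmax
    All three parameters are illustrative; in practice they are tuned
    per task to trade off utility against the privacy budget.
    """
    rng = np.random.default_rng() if rng is None else rng
    votes = np.asarray(votes, dtype=float)

    # Selective answering: respond only when the noisy top vote count
    # signals strong teacher consensus; otherwise the student gets nothing.
    if votes.max() + rng.normal(0.0, sigma1) < threshold:
        return None

    # Noisy argmax with concentrated (Gaussian) noise on the vote histogram.
    noisy_counts = votes + rng.normal(0.0, sigma2, size=votes.shape)
    return int(np.argmax(noisy_counts))

# Example: 250 teachers, 10 classes, strong consensus on class 2.
example_votes = [3, 2, 220, 5, 4, 3, 6, 2, 3, 2]
print(confident_gnmax_aggregate(example_votes, threshold=200,
                                sigma1=150, sigma2=40))
```

Abstaining on low-consensus queries is what keeps the privacy cost low: only queries where many teachers already agree, and whose answers therefore reveal little about any single teacher's training data, are ever answered.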
