Towards Understanding the Impact of Model Size on Differential Private Classification

by   Yinchen Shen, et al.

Differential privacy (DP) is an essential technique for privacy-preserving. It was found that a large model trained for privacy preserving performs worse than a smaller model (e.g. ResNet50 performs worse than ResNet18). To better understand this phenomenon, we study high dimensional DP learning from the viewpoint of generalization. Theoretically, we show that for the simple Gaussian model with even small DP noise, if the dimension is large enough, then the classification error can be as bad as the random guessing. Then we propose a feature selection method to reduce the size of the model, based on a new metric which trades off the classification accuracy and privacy preserving. Experiments on real data support our theoretical results and demonstrate the advantage of the proposed method.


page 1

page 2

page 3

page 4


Rejoinder: Gaussian Differential Privacy

In this rejoinder, we aim to address two broad issues that cover most co...

Differential Privacy Has Disparate Impact on Model Accuracy

Differential privacy (DP) is a popular mechanism for training machine le...

P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification

Recently, deep convolutional neural networks (CNNs) have achieved great ...

Community Preserved Social Graph Publishing with Node Differential Privacy

The goal of privacy-preserving social graph publishing is to protect ind...

DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation

Recent success of deep neural networks (DNNs) hinges on the availability...

Please sign up or login with your details

Forgot password? Click here to reset