Fast and Memory Efficient Differentially Private-SGD via JL Projections

02/05/2021
by Zhiqi Bu, et al.

Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large-scale neural networks. This algorithm requires computation of per-sample gradient norms, which is extremely slow and memory intensive in practice. In this paper, we present a new framework for designing differentially private optimizers, called DP-SGD-JL and DP-Adam-JL. Our approach uses Johnson-Lindenstrauss (JL) projections to quickly approximate the per-sample gradient norms without computing them exactly, making the training time and memory requirements of our optimizers close to those of their non-DP counterparts. Unlike previous attempts to speed up DP-SGD, which work only on a subset of network architectures or rely on compiler techniques, our main contribution is an algorithmic solution that works for any network in a black-box manner. To illustrate this, we train a recurrent neural network (RNN) on the IMDb dataset and achieve a good privacy-vs-accuracy tradeoff, while being significantly faster than DP-SGD and with a memory footprint similar to that of non-private SGD. The privacy analysis of our algorithms is more involved than that of DP-SGD; we use the recently proposed f-DP framework of Dong et al. (2019) to prove privacy.
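As a rough illustration of the JL idea (not the authors' implementation), the NumPy sketch below estimates per-sample gradient norms from a few random Gaussian projections: for v ~ N(0, I_d), E[(g·v)^2] = ||g||^2, so averaging squared projections estimates the squared norm. The function name jl_norm_estimates and the explicit per-sample gradient matrix are assumptions for exposition; the paper's optimizers obtain each projection g_i·v_j with cheap Jacobian-vector products instead of ever materializing per-sample gradients.

```python
import numpy as np

def jl_norm_estimates(per_sample_grads, k, seed=None):
    """Estimate each row's L2 norm using k JL (random Gaussian) projections.

    per_sample_grads: (n, d) array, one flattened gradient per sample.
    NOTE: materializing this matrix is exactly what the paper avoids;
    this sketch only illustrates the norm estimator itself.
    """
    rng = np.random.default_rng(seed)
    n, d = per_sample_grads.shape
    V = rng.standard_normal((d, k))      # k random Gaussian directions
    proj = per_sample_grads @ V          # (n, k): all projections g_i . v_j
    # Average of squared projections is an unbiased estimate of ||g_i||^2.
    return np.sqrt((proj ** 2).mean(axis=1))

# Quick sanity check against exact norms.
g = np.random.default_rng(0).standard_normal((8, 10_000))
print(np.c_[jl_norm_estimates(g, k=50, seed=1), np.linalg.norm(g, axis=1)])
```

With k on the order of a few dozen, the estimates concentrate tightly around the true norms, which is what lets the approximate norms stand in for the exact per-sample computation during clipping.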


Related research

12/01/2021  Differentially Private SGD with Sparse Gradients
To protect sensitive training data, differentially private stochastic gr...

06/05/2021  Numerical Composition of Differential Privacy
We give a fast algorithm to optimally compose privacy guarantees of diff...

08/03/2021  Large-Scale Differentially Private BERT
In this work, we study the large-scale pretraining of BERT-Large with di...

05/25/2023  DP-SGD Without Clipping: The Lipschitz Neural Network Way
State-of-the-art approaches for training Differentially Private (DP) Dee...

12/12/2022  Generalizing DP-SGD with Shuffling and Batching Clipping
Classical differentially private DP-SGD implements individual clipping wit...

08/23/2023  Bias-Aware Minimisation: Understanding and Mitigating Estimator Bias in Private SGD
Differentially private SGD (DP-SGD) holds the promise of enabling the sa...

05/09/2022  SmoothNets: Optimizing CNN architecture design for differentially private deep learning
The arguably most widely employed algorithm to train deep neural network...
