Privacy Amplification via Importance Sampling

07/05/2023
by Dominik Fay, et al.

We examine the privacy-enhancing properties of subsampling a data set via importance sampling as a pre-processing step for differentially private mechanisms. This extends the established privacy amplification by subsampling result to importance sampling where each data point is weighted by the reciprocal of its selection probability. The implications for privacy of weighting each point are not obvious. On the one hand, a lower selection probability leads to a stronger privacy amplification. On the other hand, the higher the weight, the stronger the influence of the point on the output of the mechanism in the event that the point does get selected. We provide a general result that quantifies the trade-off between these two effects. We show that heterogeneous sampling probabilities can lead to both stronger privacy and better utility than uniform subsampling while retaining the subsample size. In particular, we formulate and solve the problem of privacy-optimal sampling, that is, finding the importance weights that minimize the expected subset size subject to a given privacy budget. Empirically, we evaluate the privacy, efficiency, and accuracy of importance sampling-based privacy amplification on the example of k-means clustering.
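The sampling step described above can be illustrated with a minimal sketch. It assumes Poisson-style sampling, where each point is kept independently with its own probability, and it is not taken from the paper: the function names `importance_subsample`, `weighted_sum`, and `amplified_epsilon_uniform` are hypothetical. The last helper implements only the established amplification bound for uniform Poisson subsampling of an ε-DP mechanism, ε' = log(1 + q(e^ε − 1)); it does not reproduce the paper's general result for heterogeneous probabilities.

```python
import math

import numpy as np


def importance_subsample(data, probs, rng):
    """Keep row i independently with probability probs[i] (assumed Poisson sampling).

    Returns the kept rows and their importance weights 1 / probs[i].
    """
    data = np.asarray(data, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if np.any(probs <= 0) or np.any(probs > 1):
        raise ValueError("selection probabilities must lie in (0, 1]")
    mask = rng.random(len(data)) < probs
    return data[mask], 1.0 / probs[mask]


def weighted_sum(points, weights):
    """Importance-weighted sum of the kept rows.

    Its expectation over the sampling equals the sum over the full data set,
    because each row contributes x_i / q_i with probability q_i.
    """
    return (weights[:, None] * points).sum(axis=0)


def amplified_epsilon_uniform(eps, q):
    """Amplified epsilon for UNIFORM Poisson subsampling with rate q.

    This is the standard bound log(1 + q * (exp(eps) - 1)); the heterogeneous
    case studied in the paper is not covered by this helper.
    """
    return math.log1p(q * math.expm1(eps))


rng = np.random.default_rng(0)
data = np.arange(6.0).reshape(3, 2)
probs = np.array([0.5, 0.25, 1.0])  # illustrative values, not from the paper
points, weights = importance_subsample(data, probs, rng)
estimate = weighted_sum(points, weights)
```

The sketch makes the trade-off visible: a point with selection probability 0.25 is rarely included, but when it is, it enters the weighted sum with weight 4 and therefore has a larger influence on the output.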

