A Unifying Privacy Analysis Framework for Unknown Domain Algorithms in Differential Privacy
There are many existing differentially private algorithms for releasing histograms, i.e. counts with corresponding labels, in various settings. Our focus in this survey is to revisit some of the existing differentially private algorithms for releasing histograms over unknown domains, i.e. the labels of the counts that are to be released are not known beforehand. The main practical advantage of releasing histograms over an unknown domain is that the algorithm does not need to fill in missing labels because they are not present in the original histogram but in a hypothetical neighboring dataset could appear in the histogram. However, the challenge in designing differentially private algorithms for releasing histograms over an unknown domain is that some outcomes can clearly show which input was used, clearly violating privacy. The goal then is to show that the differentiating outcomes occur with very low probability. We present a unified framework for the privacy analyses of several existing algorithms. Furthermore, our analysis uses approximate concentrated differential privacy from Bun and Steinke'16, which can improve the privacy loss parameters rather than using differential privacy directly, especially when composing many of these algorithms together in an overall system.
READ FULL TEXT