Benchmarking Differentially Private Synthetic Data Generation Algorithms

12/16/2021
by Yuchao Tao, et al.

This work presents a systematic benchmark of differentially private synthetic data generation algorithms for tabular data. The utility of the synthetic data is evaluated by measuring whether it preserves the distributions of individual attributes and of pairs of attributes, whether it preserves pairwise correlations, and by the accuracy of an ML classification model. In a comprehensive empirical evaluation, we identify the top-performing algorithms and those that consistently fail to beat baseline approaches.
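
To make the marginal-based utility checks concrete, here is a minimal sketch of how one might compare the distributions of individual attributes and attribute pairs between a real and a synthetic table. The abstract does not name the distance it uses, so total variation distance is an assumption here, and the helper names (`marginal`, `tv_distance`, `marginal_errors`) and the toy records are hypothetical, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def marginal(rows, cols):
    """Empirical distribution over the given column indices of a list of tuples."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def tv_distance(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def marginal_errors(real, synth, k):
    """TV distance for every k-way marginal, keyed by column-index tuple."""
    n_cols = len(real[0])
    return {
        cols: tv_distance(marginal(real, cols), marginal(synth, cols))
        for cols in combinations(range(n_cols), k)
    }


# Toy records (hypothetical, for illustration only).
real = [("a", 0), ("a", 1), ("b", 0), ("b", 1)]
synth = [("a", 0), ("a", 0), ("b", 1), ("b", 1)]

one_way = marginal_errors(real, synth, 1)  # individual attributes
two_way = marginal_errors(real, synth, 2)  # pairs of attributes
```

In this toy case both single-attribute marginals match exactly while the pairwise marginal does not, which is one reason a benchmark might look at pairs of attributes in addition to individual ones.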

research
07/01/2023

When Synthetic Data Met Regulation

In this paper, we argue that synthetic data produced by Differentially P...
research
06/13/2023

Continual Release of Differentially Private Synthetic Data

Motivated by privacy concerns in long-term longitudinal studies in medic...
research
05/24/2023

Post-processing Private Synthetic Data for Improving Utility on Selected Measures

Existing private synthetic data generation algorithms are agnostic to do...
research
10/13/2022

Secure Multiparty Computation for Synthetic Data Generation from Distributed Data

Legal and ethical restrictions on accessing relevant data inhibit data s...
research
09/15/2022

Private Synthetic Data for Multitask Learning and Marginal Queries

We provide a differentially private algorithm for producing synthetic da...
research
01/29/2022

AIM: An Adaptive and Iterative Mechanism for Differentially Private Synthetic Data

We propose AIM, a novel algorithm for differentially private synthetic d...
research
07/12/2022

dpart: Differentially Private Autoregressive Tabular, a General Framework for Synthetic Data Generation

We propose a general, flexible, and scalable framework dpart, an open so...
