Post-processing Private Synthetic Data for Improving Utility on Selected Measures

05/24/2023
by Hao Wang, et al.
Existing private synthetic data generation algorithms are agnostic to downstream tasks. However, end users may have specific requirements that the synthetic data must satisfy. Failure to meet these requirements could significantly reduce the utility of the data for downstream use. We introduce a post-processing technique that improves the utility of the synthetic data with respect to measures selected by the end user, while preserving strong privacy guarantees and dataset quality. Our technique involves resampling from the synthetic data to filter out samples that do not meet the selected utility measures, using an efficient stochastic first-order algorithm to find optimal resampling weights. Through comprehensive numerical experiments, we demonstrate that our approach consistently improves the utility of synthetic data across multiple benchmark datasets and state-of-the-art synthetic data generation algorithms.
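
The resampling idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the objective, so the sketch assumes one plausible formulation, in which the weights stay as close as possible (in KL divergence) to uniform while the weighted average of each selected measure matches a target value. Under that assumption the weights take an exponential-tilting form, and the tilting parameter can be fitted with minibatch gradient steps. The names `fit_resampling_weights`, `Q`, `target`, and all hyperparameters are hypothetical, and the target values are assumed to be given.

```python
import numpy as np


def _softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(logits - logits.max())
    return z / z.sum()


def fit_resampling_weights(Q, target, steps=2000, lr=0.5, batch_size=256, seed=0):
    """Fit resampling weights over synthetic rows (illustrative assumption only).

    Q      : (n, k) array, value of each of k utility measures on each synthetic row.
    target : (k,) array, desired weighted average of each measure (assumed given).

    Minimizes f(lam) = log mean_i exp(lam . (Q_i - target)) with minibatch
    gradient steps; the resulting weights are w_i proportional to
    exp(lam . (Q_i - target)).
    """
    rng = np.random.default_rng(seed)
    n, k = Q.shape
    lam = np.zeros(k)
    b = min(batch_size, n)
    for t in range(steps):
        idx = rng.choice(n, size=b, replace=False)  # sample a minibatch of rows
        D = Q[idx] - target                         # deviations from the targets
        w = _softmax(D @ lam)                       # tilted weights on the batch
        grad = w @ D                                # batch estimate of E_w[Q] - target
        lam -= lr / np.sqrt(t + 1) * grad           # decaying step size
    return _softmax((Q - target) @ lam)             # weights over all rows


# Toy usage: one binary measure that holds for about 30% of the synthetic
# rows, while the (assumed) target rate is 50%.
rng = np.random.default_rng(1)
Q = (rng.random((5000, 1)) < 0.3).astype(float)
target = np.array([0.5])

weights = fit_resampling_weights(Q, target)

# Resample rows with replacement according to the fitted weights.
keep = rng.choice(len(Q), size=len(Q), replace=True, p=weights)
resampled_Q = Q[keep]
```

Because the sketch only reweights and resamples rows of data that is already private, it never touches the real records. In a private setting the target values themselves would also need to be released privately, which this sketch does not cover.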

