Data Synthesis based on Generative Adversarial Networks

by Noseong Park, et al.

Privacy is a major concern wherever data is shared with partners or released to the public. Common techniques for protecting privacy include removing identifiers, altering quasi-identifiers, and perturbing values. Unfortunately, these approaches suffer from two limitations. First, private information can still leak if attackers possess background knowledge or other information sources. Second, they do not account for the adverse impact they have on the utility of the released data. In this paper, we propose a method that addresses both limitations. Our method, called table-GAN, uses generative adversarial networks (GANs) to synthesize fake tables that are statistically similar to the original table yet do not incur information leakage. We show that machine learning models trained on our synthetic tables perform similarly, on unseen test cases, to models trained on the original table. We call this property model compatibility. We believe that anonymization, perturbation, and synthesis methods without model compatibility are of little value. We used four real-world datasets from four different domains for our experiments and conducted in-depth comparisons with state-of-the-art anonymization techniques. Throughout our experiments, only our method consistently shows a balance between privacy level and model compatibility.
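The notion of model compatibility described in the abstract can be illustrated with a minimal sketch: train one model on the original table and another on a synthetic table, then score both on the same held-out test set and compare. Everything in the code below is an assumption made for illustration. The toy data, the nearest-centroid classifier, and in particular `naive_synthesizer` (a per-class Gaussian sampler standing in for a generator) are hypothetical and are not the table-GAN method or the paper's evaluation protocol.

```python
import random
import statistics


def make_table(n, rng):
    """Toy table (hypothetical data): two numeric features and a binary label."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label else -2.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


def naive_synthesizer(table, n, rng):
    """Stand-in generator (NOT table-GAN): fits a Gaussian per class and
    per feature on the original table, then samples new rows from it."""
    by_label = {}
    for features, label in table:
        by_label.setdefault(label, []).append(features)
    stats = {
        label: [(statistics.mean(col), statistics.stdev(col)) for col in zip(*feats)]
        for label, feats in by_label.items()
    }
    labels = sorted(stats)
    synthetic = []
    for _ in range(n):
        label = rng.choice(labels)
        synthetic.append(([rng.gauss(m, s) for m, s in stats[label]], label))
    return synthetic


def fit_centroids(table):
    """Train a nearest-centroid classifier: one mean vector per label."""
    by_label = {}
    for features, label in table:
        by_label.setdefault(label, []).append(features)
    return {
        label: [statistics.mean(col) for col in zip(*feats)]
        for label, feats in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest in squared distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, table):
    hits = sum(predict(centroids, features) == label for features, label in table)
    return hits / len(table)


def model_compatibility(original, synthetic, test):
    """Score both models on the same unseen test set and report the gap."""
    acc_orig = accuracy(fit_centroids(original), test)
    acc_syn = accuracy(fit_centroids(synthetic), test)
    return acc_orig, acc_syn, abs(acc_orig - acc_syn)


rng = random.Random(0)
original = make_table(500, rng)
test = make_table(500, rng)  # held-out rows never shown to either model
synthetic = naive_synthesizer(original, 500, rng)

acc_orig, acc_syn, gap = model_compatibility(original, synthetic, test)
print(f"trained on original:  {acc_orig:.3f}")
print(f"trained on synthetic: {acc_syn:.3f}")
print(f"compatibility gap:    {gap:.3f}")
```

A small gap suggests that the synthetic table can stand in for the original when training this particular model. The sketch says nothing about privacy: a real evaluation would also need to measure information leakage, which this toy example does not attempt.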



