Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics

by   Bingzhe Wu, et al.

Bayesian deep learning is recently regarded as an intrinsic way to characterize the weight uncertainty of deep neural networks (DNNs). Stochastic Gradient Langevin Dynamics (SGLD) is an effective method to enable Bayesian deep learning on large-scale datasets. Previous theoretical studies have shown various appealing properties of SGLD, ranging from the convergence properties to the generalization bounds. In this paper, we study the properties of SGLD from a novel perspective of membership privacy protection (i.e., preventing the membership attack). The membership attack, which aims to determine whether a specific sample is used for training a given DNN model, has emerged as a common threat against deep learning algorithms. To this end, we build a theoretical framework to analyze the information leakage (w.r.t. the training dataset) of a model trained using SGLD. Based on this framework, we demonstrate that SGLD can prevent the information leakage of the training dataset to a certain extent. Moreover, our theoretical analysis can be naturally extended to other types of Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods. Empirical results on different datasets and models verify our theoretical findings and suggest that the SGLD algorithm can not only reduce the information leakage but also improve the generalization ability of the DNN models in real-world applications.


page 1

page 2

page 3

page 4


Generalization Bounds for Stochastic Gradient Langevin Dynamics: A Unified View via Information Leakage Analysis

Recently, generalization bounds of the non-convex empirical risk minimiz...

Stochastic Gradient Langevin Dynamics Algorithms with Adaptive Drifts

Bayesian deep learning offers a principled way to address many issues co...

Generalization in Generative Adversarial Networks: A Novel Perspective from Privacy Protection

In this paper, we aim to understand the generalization properties of gen...

Non-convex Learning via Replica Exchange Stochastic Gradient MCMC

Replica exchange Monte Carlo (reMC), also known as parallel tempering, i...

Modelling and Quantifying Membership Information Leakage in Machine Learning

Machine learning models have been shown to be vulnerable to membership i...

Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten

As the use of machine learning (ML) models is becoming increasingly popu...

An Analysis Of Protected Health Information Leakage In Deep-Learning Based De-Identification Algorithms

The increasing complexity of algorithms for analyzing medical data, incl...

Please sign up or login with your details

Forgot password? Click here to reset