The Unintended Consequences of Overfitting: Training Data Inference Attacks

09/05/2017
by Samuel Yeom, et al.

Machine learning algorithms that are applied to sensitive data pose a distinct threat to privacy. A growing body of prior work demonstrates that models produced by these algorithms may leak specific private information in the training data to an attacker, either through their structure or their observable behavior. However, the underlying cause of this privacy risk is not well understood beyond a handful of anecdotal accounts that suggest overfitting and influence might play a role. This paper examines the effect that overfitting and influence have on the ability of an attacker to learn information about the training data from machine learning models, either through training set membership inference or model inversion attacks. Using both formal and empirical analyses, we illustrate a clear relationship between these factors and the privacy risk that arises in several popular machine learning algorithms. We find that overfitting is sufficient to allow an attacker to perform membership inference and, when certain conditions on the influence of the relevant features hold, model inversion attacks as well. Interestingly, our formal analysis also shows that overfitting is not necessary for these attacks and begins to shed light on what other factors may be in play. Finally, we explore the relationship between the two types of attack, membership inference and model inversion, and show that there are deep connections between them that lead to effective new attacks.
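The membership inference side of this claim can be made concrete with a small sketch. The snippet below is an illustrative approximation rather than code from the paper: it overfits a classifier on half of a synthetic dataset and then guesses membership by comparing each example's loss to the model's average training loss. The dataset, model choice, and threshold rule are assumptions made only for this example.

```python
# Illustrative sketch (not the paper's code): a loss-threshold membership
# inference attack against an overfit classifier. Assumes scikit-learn and numpy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Train a deliberately overfit model on half of the data ("members").
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
model = RandomForestClassifier(n_estimators=50, max_depth=None, random_state=0)
model.fit(X_train, y_train)

def per_example_loss(model, X, y):
    """Cross-entropy loss of each example under the model's predicted probabilities."""
    probs = model.predict_proba(X)
    eps = 1e-12
    return -np.log(np.clip(probs[np.arange(len(y)), y], eps, 1.0))

# Hypothetical threshold choice: the model's average loss on its own training set.
threshold = per_example_loss(model, X_train, y_train).mean()

def infer_membership(model, X, y, threshold):
    """Guess 'member' when an example's loss does not exceed the threshold."""
    return per_example_loss(model, X, y) <= threshold

# Members should be flagged, non-members should not; the gap between the two
# rates grows with the generalization gap (i.e., with overfitting).
true_positive_rate = infer_membership(model, X_train, y_train, threshold).mean()
false_positive_rate = infer_membership(model, X_test, y_test, threshold).mean()
print(f"attack advantage ~ {true_positive_rate - false_positive_rate:.3f}")
```

The key design point the sketch tries to capture is that the attack needs nothing beyond black-box access to the model's outputs and a loss threshold: when the model generalizes well, training and test losses are similar and the advantage shrinks toward zero, which is the intuition behind the sufficiency result stated above.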


