The Crossover Process: Learnability and Data Protection from Inference Attacks

06/13/2016
by Richard Nock, et al.

Data protection and learnability are usually treated as conflicting objectives. This is not always the case: we show how to jointly control inference (seen as the attack) and learnability with a noise-free process that mixes training examples, the Crossover Process (cp). A key point is that a cp can typically alter joint distributions without changing the marginals or the sufficient statistic for the class. In other words, it preserves (and sometimes improves) generalization for supervised learning, but it can alter the relationship between covariates, and can therefore fool measures of nonlinear independence and causal inference into misleading ad-hoc conclusions. For example, a cp can increase or decrease odds ratios, create or break fairness, tamper with disparate impact, strengthen, weaken, or reverse causal directions, and change observed statistical measures of dependence. For each of these effects, we quantify the changes brought by a cp, as well as its statistical impact on generalization, via a new complexity measure that we call the Rademacher cp complexity. Experiments on a dozen readily available domains validate the theory.
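
To make the idea of mixing examples without noise more concrete, here is a minimal sketch under stated assumptions. It assumes the mixing is a permutation of a chosen subset of feature columns among examples that share the same class label, and it takes the sufficient statistic for the class to be the sum of label-weighted feature vectors. The abstract does not confirm either choice. The function name `crossover`, the toy data, and the parameter `cols` are hypothetical and introduced only for illustration.

```python
import numpy as np


def crossover(X, y, cols, rng):
    """Permute the columns in `cols` among rows that share the same label.

    Hypothetical illustration only: an assumed within-class shuffle,
    not necessarily the construction used in the paper.
    """
    X_new = X.copy()
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)   # rows of this class
        perm = rng.permutation(idx)        # shuffled donor rows, same class
        X_new[np.ix_(idx, cols)] = X[np.ix_(perm, cols)]
    return X_new


rng = np.random.default_rng(0)
n = 2000
y = rng.choice([-1, 1], size=n)
x0 = rng.normal(size=n)
x1 = x0 + 0.1 * rng.normal(size=n)         # x1 depends strongly on x0
X = np.column_stack([x0, x1])

X_cp = crossover(X, y, cols=[1], rng=rng)

# Assumed sufficient statistic for the class: sum_i y_i * x_i.
stat_before = (y[:, None] * X).sum(axis=0)
stat_after = (y[:, None] * X_cp).sum(axis=0)

# Dependence between the two covariates, before and after the shuffle.
corr_before = np.corrcoef(X[:, 0], X[:, 1])[0, 1]
corr_after = np.corrcoef(X_cp[:, 0], X_cp[:, 1])[0, 1]

print("statistic preserved:", np.allclose(stat_before, stat_after))
print("correlation before / after:", corr_before, corr_after)
```

In this toy setting, each column keeps exactly the same multiset of values within each class, so the per-feature marginals and the label-weighted sum are unchanged, while the observed dependence between the two covariates is largely destroyed. Values are only moved between examples; no noise is added.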

