Database Matching Under Adversarial Column Deletions

12/14/2022
by Serhat Bakirtas, et al.

The de-anonymization of users from anonymized microdata by matching or aligning it with publicly available correlated databases has recently attracted scientific interest. While most rigorous analyses of database matching have focused on random-distortion models, adversarial-distortion models have received little attention in the literature. In this work, motivated by synchronization errors in the sampling of time-indexed microdata, the matching (alignment) of random databases under adversarial column deletions is investigated. It is assumed that a constrained adversary, which observes the anonymized database, can delete up to a δ fraction of the columns (attributes) to hinder matching and preserve privacy. Column histograms of the two databases are used as permutation-invariant features to detect the column deletion pattern chosen by the adversary. This detection step is then followed by an exact row (user) matching scheme. The worst-case analysis of this two-phase scheme yields a sufficient condition for the successful matching of the two databases under the near-perfect recovery condition. A more detailed investigation of the error probability leads to a tight necessary condition on the database growth rate and, in turn, to a single-letter characterization of the adversarial matching capacity. This adversarial matching capacity is shown to be significantly lower than the random matching capacity, which applies when the column deletions occur randomly. Overall, our results analytically demonstrate the privacy advantages of adversarial mechanisms over random ones in the publication of anonymized time-indexed data.
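
The two-phase idea described in the abstract (histogram-based detection of the deleted columns, then exact row matching) can be illustrated with a toy sketch. This is not the authors' algorithm or analysis: it assumes a noiseless setting in which the second database is a row-shuffled copy of the first with some columns removed, that every column of the first database has a distinct histogram, and that rows remain distinct on the retained columns. The function names and the example data are hypothetical.

```python
from collections import Counter


def column_histogram(db, j):
    """Symbol counts of column j; unchanged by any shuffling of the rows."""
    return tuple(sorted(Counter(row[j] for row in db).items()))


def detect_retained_columns(x, y):
    """Phase 1: find which columns of x survive in y.

    Assumes deletions preserve column order and that the column
    histograms of x are distinct, so a left-to-right scan suffices.
    """
    hx = [column_histogram(x, j) for j in range(len(x[0]))]
    retained, j = [], 0
    for k in range(len(y[0])):
        hy = column_histogram(y, k)
        while j < len(hx) and hx[j] != hy:
            j += 1  # column j of x is treated as deleted
        if j == len(hx):
            raise ValueError("no consistent deletion pattern found")
        retained.append(j)
        j += 1
    return retained


def match_rows(x, y, retained):
    """Phase 2: exact row matching on the retained columns."""
    index = {}
    for i, row in enumerate(x):
        key = tuple(row[j] for j in retained)
        if key in index:
            raise ValueError("rows are not distinguishable after deletion")
        index[key] = i
    return [index[tuple(row)] for row in y]


# Hypothetical 4-user, 5-attribute database over the alphabet {0, 1, 2}.
x = [
    [0, 1, 0, 2, 1],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 1, 2],
    [1, 1, 1, 2, 2],
]
# Columns 1 and 3 deleted, rows shuffled into the order 2, 0, 3, 1.
y = [
    [0, 0, 2],
    [0, 0, 1],
    [1, 1, 2],
    [1, 0, 2],
]

retained = detect_retained_columns(x, y)
matching = match_rows(x, y, retained)
print(retained)  # [0, 2, 4]
print(matching)  # [2, 0, 3, 1]
```

In this sketch the detection step never looks at row order, which is why histograms are a natural feature here; when two columns share a histogram, the simple scan above can pick the wrong column, and handling that case is outside the scope of this illustration.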

research
02/03/2022

Database Matching Under Column Repetitions

Motivated by synchronization errors in the sampling of time-indexed data...
research
02/03/2022

Seeded Database Matching Under Noisy Column Repetitions

The re-identification or de-anonymization of users from anonymized data ...
research
05/20/2021

Database Matching Under Column Deletions

De-anonymizing user identities by matching various forms of user data av...
research
01/17/2023

Database Matching Under Noisy Synchronization Errors

The re-identification or de-anonymization of users from anonymized data ...
research
01/23/2019

A Concentration of Measure Approach to Database De-anonymization

In this paper, matching of correlated high-dimensional databases is inve...
research
07/05/2023

Gaussian Database Alignment and Gaussian Planted Matching

Database alignment is a variant of the graph alignment problem: Given a ...
research
06/23/2022

Detecting Correlated Gaussian Databases

This paper considers the problem of detecting whether two databases, eac...
