Re-identification of Individuals in Genomic Datasets Using Public Face Images

DNA sequencing is becoming increasingly commonplace in both medical and direct-to-consumer settings. To promote discovery, collected genomic data is often de-identified and shared, either in public repositories such as OpenSNP or with researchers through access-controlled repositories. However, recent studies have suggested that genomic data can be effectively matched to high-resolution three-dimensional face images, raising the concern that increasingly ubiquitous public face images could be linked to shared genomic data, thereby re-identifying the individuals in it. While these investigations illustrate the possibility of such an attack, they assume that those performing the linkage have access to extremely well-curated data. Because this is unlikely to be the case in practice, the practicality of the attack is open to question. We therefore systematically study this re-identification risk from two perspectives: first, we investigate how successful such linkage attacks can be when real face images are used, and second, we consider how individuals can be empowered to better control the associated re-identification risk. We observe that the true risk of re-identification is likely substantially smaller for most individuals than prior literature suggests. In addition, we demonstrate that adding a small amount of carefully crafted noise to images enables a controlled trade-off between re-identification success and the quality of shared images, with risk typically lowered significantly even when the noise is imperceptible to humans.




