Graph Unlearning

03/27/2021
by   Min Chen, et al.

The right to be forgotten states that a data subject has the right to have their data erased by an entity storing it. In the context of machine learning (ML), it requires the ML model provider to remove the data subject's data from the training set used to build the ML model, a process known as machine unlearning. While straightforward and legitimate, retraining the ML model from scratch upon receiving unlearning requests incurs high computational overhead when the training set is large. To address this issue, a number of approximate algorithms have been proposed in the domain of image and text data, among which SISA is the state-of-the-art solution. It randomly partitions the training set into multiple shards and trains a constituent model for each shard. However, directly applying SISA to graph data can severely damage the graph structural information, and thereby the utility of the resulting ML model. In this paper, we propose GraphEraser, a novel machine unlearning method tailored to graph data. Its contributions include two novel graph partition algorithms and a learning-based aggregation method. We conduct extensive experiments on five real-world datasets to illustrate the unlearning efficiency and model utility of GraphEraser. We observe that GraphEraser achieves a 2.06× (small dataset) to 35.94× (large dataset) improvement in unlearning time compared to retraining from scratch. In terms of model utility, GraphEraser achieves up to 62.5% higher F1 score than random partitioning. In addition, our proposed learning-based aggregation method achieves up to 112% higher F1 score than majority vote aggregation.
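To make the shard-based idea concrete, the following is a minimal sketch of the SISA-style baseline the abstract describes: random partitioning into shards, one constituent model per shard, and majority vote aggregation. It is not GraphEraser and not the SISA authors' code. The class `ShardedEnsemble`, the toy one-dimensional centroid classifier, and all parameter names are assumptions made for illustration only. The point it shows is that an unlearning request retrains just the shard holding the removed sample.

```python
import random
from collections import Counter


def train_centroid_model(samples):
    """Toy constituent model (an assumption): per-class mean of a 1-D feature."""
    sums, counts = {}, Counter()
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_one(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


class ShardedEnsemble:
    """Hypothetical shard-based ensemble: random shards, one model per shard."""

    def __init__(self, data, num_shards, seed=0):
        # data maps sample id -> (feature, label)
        self.data = dict(data)
        ids = list(self.data)
        random.Random(seed).shuffle(ids)  # random partitioning
        self.shards = [ids[i::num_shards] for i in range(num_shards)]
        self.models = [self._train(shard) for shard in self.shards]
        self.retrain_count = 0

    def _train(self, shard):
        return train_centroid_model([self.data[i] for i in shard])

    def unlearn(self, sample_id):
        """Remove one sample and retrain only the shard that contained it."""
        for k, shard in enumerate(self.shards):
            if sample_id in shard:
                shard.remove(sample_id)
                del self.data[sample_id]
                self.models[k] = self._train(shard)
                self.retrain_count += 1
                return k
        raise KeyError(sample_id)

    def predict(self, x):
        """Majority vote over the constituent models (empty shards abstain)."""
        votes = Counter(predict_one(m, x) for m in self.models if m)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Two well-separated classes: label 0 near 0..5, label 1 near 10..15.
    data = {i: (float(i), 0) for i in range(6)}
    data.update({i: (float(i + 4), 1) for i in range(6, 12)})
    ensemble = ShardedEnsemble(data, num_shards=3)
    retrained = ensemble.unlearn(3)
    print("retrained shard:", retrained, "prediction:", ensemble.predict(1.0))
```

Because the partition here is random and ignores edges entirely, applying it to nodes of a graph would cut neighbours apart, which is the loss of structural information the abstract cites as the motivation for graph-aware partitioning and a learned aggregation step.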


