Differentially Private k-Means Clustering with Guaranteed Convergence

02/03/2020
by Zhigang Lu, et al.

Iterative clustering algorithms help us learn the insights behind data. Unfortunately, they may also allow adversaries with some background knowledge to infer private information about individuals. In the worst case, the adversaries know the centroids of an arbitrary iteration and the information of n-1 out of n items. To protect individual privacy against such an inference attack, preserving differential privacy (DP) for iterative clustering algorithms has been extensively studied in the interactive setting. However, existing interactive differentially private clustering algorithms suffer from a non-convergence problem: they may not terminate unless the number of iterations is fixed in advance. This problem severely impacts both the clustering quality and the efficiency of a differentially private algorithm. To resolve it, we propose a novel differentially private clustering framework for the interactive setting that controls the orientation of the centroids' movement over the iterations and ensures convergence by injecting DP noise in a selected area. We prove that, in the expected case, an algorithm under our framework converges in at most twice the iterations of Lloyd's algorithm. Experimental evaluations on real-world datasets show that our algorithm outperforms the state-of-the-art interactive differentially private clustering algorithms, offering guaranteed convergence and better clustering quality under the same DP requirement.
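
To make the non-convergence problem concrete, here is a minimal sketch of a generic noisy Lloyd iteration using the Laplace mechanism. It is not the framework proposed in the paper. It assumes every coordinate lies in [-1, 1], add/remove-one neighbouring datasets, an even split of each iteration's budget between counts and sums, and an even split of the total budget across iterations. All function and parameter names (`laplace`, `dp_lloyd_step`, `dp_kmeans`, `eps_total`, `iterations`) are hypothetical.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def nearest(point, centroids):
    # Index of the centroid with the smallest squared Euclidean distance.
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def dp_lloyd_step(points, centroids, eps_iter, rng):
    # One noisy Lloyd update. Assumption: coordinates are in [-1, 1], so a
    # single point changes one cluster's count by 1 and its sum by at most
    # d in L1 norm.
    d, k = len(points[0]), len(centroids)
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for x in points:
        j = nearest(x, centroids)
        counts[j] += 1
        for i in range(d):
            sums[j][i] += x[i]

    updated = []
    for j in range(k):
        # Half of eps_iter for the count (sensitivity 1) ...
        noisy_count = max(counts[j] + laplace(2.0 / eps_iter, rng), 1.0)
        # ... and half for the coordinate sums (L1 sensitivity d).
        centroid = []
        for i in range(d):
            noisy_sum = sums[j][i] + laplace(2.0 * d / eps_iter, rng)
            centroid.append(max(-1.0, min(1.0, noisy_sum / noisy_count)))
        updated.append(centroid)
    return updated


def dp_kmeans(points, k, eps_total, iterations, seed=0):
    rng = random.Random(seed)
    d = len(points[0])
    # Data-independent initialisation, so it consumes no privacy budget.
    centroids = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(k)]
    eps_iter = eps_total / iterations  # sequential composition
    for _ in range(iterations):  # fixed count: noise prevents a natural stop
        centroids = dp_lloyd_step(points, centroids, eps_iter, rng)
    return centroids
```

Because fresh noise is drawn in every iteration, the centroids keep moving even after the cluster assignments would otherwise have stabilised, which is why this sketch relies on a predefined `iterations` count rather than a convergence test.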

research
01/07/2023

k-Means SubClustering: A Differentially Private Algorithm with Improved Clustering Quality

In today's data-driven world, the sensitivity of information has been a ...
research
07/18/2022

Concurrent Composition Theorems for Differential Privacy

We study the concurrent composition properties of interactive differenti...
research
10/03/2020

Utility-efficient Differentially Private K-means Clustering based on Cluster Merging

Differential privacy is widely used in data analysis. State-of-the-art k...
research
05/23/2016

DP-EM: Differentially Private Expectation Maximization

The iterative nature of the expectation maximization (EM) algorithm pres...
research
12/05/2018

Differentially Private User-based Collaborative Filtering Recommendation Based on K-means Clustering

Collaborative filtering (CF) recommendation algorithms are well-known fo...
research
01/07/2020

Protect Edge Privacy in Path Publishing with Differential Privacy

Paths in a given network are a generalised form of time-serial chains in...
research
09/04/2019

Differentially Private SQL with Bounded User Contribution

Differential privacy (DP) provides formal guarantees that the output of ...
