Average Sensitivity of Spectral Clustering
Spectral clustering is one of the most popular clustering methods for finding clusters in a graph, which has found many applications in data mining. However, the input graph in those applications may have many missing edges due to error in measurement, withholding for a privacy reason, or arbitrariness in data conversion. To make reliable and efficient decisions based on spectral clustering, we assess the stability of spectral clustering against edge perturbations in the input graph using the notion of average sensitivity, which is the expected size of the symmetric difference of the output clusters before and after we randomly remove edges. We first prove that the average sensitivity of spectral clustering is proportional to λ_2/λ_3^2, where λ_i is the i-th smallest eigenvalue of the (normalized) Laplacian. We also prove an analogous bound for k-way spectral clustering, which partitions the graph into k clusters. Then, we empirically confirm our theoretical bounds by conducting experiments on synthetic and real networks. Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.
READ FULL TEXT