Communication-Optimal Distributed Dynamic Graph Clustering

11/14/2018
by   Chun Jiang Zhu, et al.

We consider the problem of clustering graph nodes over large-scale dynamic graphs, such as citation networks, images, and web networks, when graph updates (node and edge insertions and deletions) are observed in a distributed manner. We propose communication-efficient algorithms for two well-established communication models: the message passing model and the blackboard model. Given a graph with n nodes that is observed at s remote sites over the time interval [1,t], the proposed algorithms have communication costs Õ(ns) in the message passing model and Õ(n+s) in the blackboard model, where Õ hides a polylogarithmic factor. These costs almost match the corresponding lower bounds, Ω(ns) and Ω(n+s). More importantly, we prove that at each time point in [1,t] our algorithms produce a clustering whose quality is nearly as good as that obtained by centralizing all updates up to that time and then applying a standard centralized clustering algorithm. Extensive experiments on both synthetic and real-life datasets confirm that our approach communicates less than baseline algorithms while achieving comparable clustering results.
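To make the comparison concrete, here is a minimal Python sketch of the naive baseline mentioned in the abstract: centralizing every update up to a given time. It is not the paper's algorithm. The update format `(time, op, u, v)` and the names `centralize` and `bounds_without_polylog` are hypothetical, chosen only for illustration. The second function simply evaluates ns and n+s with the polylogarithmic factor dropped, to show how far apart the two bounds are.

```python
from collections import Counter

def centralize(site_streams, upto):
    """Naive baseline: each site forwards every update with time <= upto.

    site_streams: one list per site of (time, op, u, v) tuples, where op is
    "+" for an edge insertion and "-" for an edge deletion (assumed format).
    Returns the resulting edge set and the number of forwarded updates.
    """
    multiplicity = Counter()
    messages = 0
    for stream in site_streams:
        for time, op, u, v in stream:
            if time > upto:
                continue
            messages += 1  # one message per forwarded update
            edge = (min(u, v), max(u, v))  # undirected edge, canonical order
            multiplicity[edge] += 1 if op == "+" else -1
    graph = {edge for edge, count in multiplicity.items() if count > 0}
    return graph, messages

def bounds_without_polylog(n, s):
    """Evaluate the two stated bounds with the polylog factor ignored."""
    return {"message_passing": n * s, "blackboard": n + s}

if __name__ == "__main__":
    streams = [
        [(1, "+", 0, 1), (2, "+", 1, 2)],   # updates observed at site 0
        [(2, "-", 0, 1), (3, "+", 2, 3)],   # updates observed at site 1
    ]
    print(centralize(streams, upto=2))
    print(bounds_without_polylog(n=1000, s=10))
```

The baseline's cost grows with the total number of updates seen in [1,t], whereas the bounds stated in the abstract depend only on n and s up to polylogarithmic factors.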
