Online Distributed Estimation of Principal Eigenspaces
Principal component analysis (PCA) is a widely used dimension reduction technique with a broad range of applications. In this paper, we propose an online distributed algorithm for recovering the principal eigenspaces. We further establish its rate of convergence and show how that rate depends on the number of nodes used in the distributed computation, the effective rank of the data matrix under consideration, and the gap in the spectrum of the underlying population covariance matrix. The proposed algorithm is illustrated on low-rank approximation and k-means clustering tasks. The numerical results show a substantial computational speed-up relative to standard distributed PCA algorithms, without compromising learning accuracy.
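To make the setting concrete, here is a minimal Python sketch of one generic way to estimate a principal eigenspace online across several nodes. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It assumes that each node runs a standard Oja-style streaming update on its own data and that the nodes periodically synchronize by averaging their projection matrices. All function names, the step size `lr`, and the synchronization period `sync_every` are hypothetical choices made for illustration.

```python
import numpy as np


def local_oja_step(U, x, lr):
    """One Oja-style update of a d x k orthonormal basis U with one sample x."""
    U = U + lr * np.outer(x, x @ U)   # stochastic gradient step on x x^T
    Q, _ = np.linalg.qr(U)            # re-orthonormalize the columns
    return Q


def average_bases(bases):
    """Merge node estimates: average the projectors, keep the top-k eigenvectors."""
    k = bases[0].shape[1]
    P = sum(U @ U.T for U in bases) / len(bases)
    _, vecs = np.linalg.eigh(P)       # eigenvalues come back in ascending order
    return vecs[:, -k:]


def online_distributed_pca(streams, k, lr=0.01, sync_every=50, seed=0):
    """streams: list of (n x d) arrays, one per node, read one row at a time."""
    rng = np.random.default_rng(seed)
    n, d = streams[0].shape
    U0, _ = np.linalg.qr(rng.standard_normal((d, k)))
    bases = [U0.copy() for _ in streams]
    for t in range(n):
        # Each node consumes its next sample independently.
        for i, X in enumerate(streams):
            bases[i] = local_oja_step(bases[i], X[t], lr)
        # Assumed communication round: every node adopts the merged estimate.
        if (t + 1) % sync_every == 0:
            merged = average_bases(bases)
            bases = [merged.copy() for _ in streams]
    return average_bases(bases)


def subspace_distance(U, V):
    """Spectral norm of the difference between the two projection matrices."""
    return np.linalg.norm(U @ U.T - V @ V.T, 2)
```

In this sketch the spectral gap enters through how quickly the local updates separate the leading directions from the rest, and the number of nodes enters through the averaging step. Those are the same quantities the abstract names, but the sketch makes no claim about the rates established in the paper.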