New Streaming Algorithms for High Dimensional EMD and MST

11/05/2021
by Xi Chen, et al.

We study streaming algorithms for two fundamental geometric problems: computing the cost of a Minimum Spanning Tree (MST) of an n-point set X ⊂ {1,2,…,Δ}^d, and computing the Earth Mover Distance (EMD) between two multi-sets A, B ⊂ {1,2,…,Δ}^d of size n. We consider the turnstile model, where points can be both added and removed. We give a one-pass streaming algorithm for MST and a two-pass streaming algorithm for EMD, both achieving an approximation factor of Õ(log n) using only polylog(n, d, Δ) space. Furthermore, our algorithm for EMD can be compressed to a single pass at the cost of a small additive error. Previously, the best known sublinear-space streaming algorithms for either problem achieved an approximation of O(min{log n, log(Δd)} log n) [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20]. For MST, we also prove that any constant-space streaming algorithm can achieve an approximation no better than Ω(log n), analogous to the Ω(log n) lower bound for EMD of [Andoni-Indyk-Krauthgamer '08]. Our algorithms are based on an improved analysis of a recursive space-partitioning method known generically as the Quadtree. Specifically, we show that the Quadtree achieves an Õ(log n) approximation for both EMD and MST, improving on the O(min{log n, log(Δd)} log n) approximation of [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20].
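To make the Quadtree idea concrete, here is a minimal offline sketch of the classical randomly shifted grid estimator for EMD. It is not the paper's streaming algorithm or its improved analysis. The function name `quadtree_emd_estimate`, the `seed` parameter, and the choice of weighting each level by its cell side length are assumptions made for illustration. At every level, the sketch counts how unbalanced each grid cell is between A and B and charges that imbalance the cell's side length.

```python
import random
from collections import Counter


def quadtree_emd_estimate(A, B, delta, seed=0):
    """Illustrative quadtree-style EMD estimate for A, B in {1..delta}^d.

    Assumption: this is the textbook randomly shifted grid estimator,
    not the algorithm analyzed in the paper.
    """
    if len(A) != len(B):
        raise ValueError("A and B must have the same size")
    if not A:
        return 0
    d = len(A[0])
    rng = random.Random(seed)

    # Side length of the top-level cell: smallest power of two >= delta.
    top = 1
    while top < delta:
        top *= 2

    # One random shift per coordinate, shared by all levels.
    shift = [rng.randrange(top) for _ in range(d)]

    estimate = 0
    side = 1
    while side <= top:
        # Signed count per cell: +1 for each point of A, -1 for each of B.
        diff = Counter()
        for p in A:
            cell = tuple((x + s) // side for x, s in zip(p, shift))
            diff[cell] += 1
        for q in B:
            cell = tuple((x + s) // side for x, s in zip(q, shift))
            diff[cell] -= 1
        # Unmatched points in a cell are charged the cell's side length.
        estimate += side * sum(abs(v) for v in diff.values())
        side *= 2
    return estimate
```

The per-cell signed counts are linear in the input: inserting a point changes one counter per level by one, and deleting it undoes that change. This linearity is what makes estimators of this kind natural candidates for the turnstile model, although a real streaming algorithm would keep small sketches of these counts rather than the exact counters stored here.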
