A Simple Proof of a New Set Disjointness with Applications to Data Streams

05/24/2021
by Akshay Kamath, et al.

The multiplayer promise set disjointness problem is one of the most widely used problems from communication complexity in applications. In this problem there are k players with subsets S^1, …, S^k, each drawn from {1, 2, …, n}, and we are promised that either (1) the sets are pairwise disjoint, or (2) there is a unique element j occurring in all the sets, which are otherwise pairwise disjoint. The total communication of solving this problem with constant probability in the blackboard model is Ω(n/k). We observe that for most applications, it instead suffices to look at what we call the “mostly” set disjointness problem, which changes case (2) to say there is a unique element j occurring in at least half of the sets, and the sets are otherwise disjoint. This change gives us a much simpler proof of an Ω(n/k) randomized total communication lower bound, avoiding Hellinger distance and Poincaré inequalities. Using this we show several new results for data streams:

* For ℓ_2-Heavy Hitters, any O(1)-pass streaming algorithm in the insertion-only model for detecting if an ε-ℓ_2-heavy hitter exists requires min((1/ε^2) log(ε^2 n/δ), (1/ε) n^{1/2}) bits of memory, which is optimal up to a log n factor. For deterministic algorithms and constant ε, this gives an Ω(n^{1/2}) lower bound, improving the prior Ω(log n) lower bound. We also obtain lower bounds for Zipfian distributions.
* For ℓ_p-Estimation, p > 2, we show an O(1)-pass Ω(n^{1-2/p} log(1/δ)) bit lower bound for outputting an O(1)-approximation with probability 1-δ, in the insertion-only model. This is optimal, and the best previous lower bound was Ω(n^{1-2/p} + log(1/δ)).
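To make the promise in the “mostly” set disjointness problem concrete, the following Python sketch classifies a list of k sets into case (1), case (2), or neither. It is only an illustration of the problem statement above, not a protocol or a construction from the paper; the function name `classify_instance` and the returned labels are hypothetical, and reading "at least half" as 2·count ≥ k is an assumption.

```python
from collections import Counter
from typing import List, Optional, Set


def classify_instance(sets: List[Set[int]]) -> Optional[str]:
    """Classify an instance of "mostly" set disjointness (illustrative only).

    Returns "disjoint" if the sets are pairwise disjoint, "mostly_intersecting"
    if exactly one element is shared and it lies in at least half of the sets,
    and None if the input violates the promise.
    """
    k = len(sets)
    # Number of players' sets that contain each element.
    counts = Counter(x for s in sets for x in s)
    # Elements that occur in more than one set.
    shared = [x for x, c in counts.items() if c > 1]
    if not shared:
        return "disjoint"
    # Assumption: "at least half of the sets" means 2 * count >= k.
    if len(shared) == 1 and 2 * counts[shared[0]] >= k:
        return "mostly_intersecting"
    return None


if __name__ == "__main__":
    print(classify_instance([{1, 2}, {3}, {4, 5}, {6}]))
    print(classify_instance([{1, 7}, {3, 7}, {4, 5}, {6}]))
```

An input for which the function returns None lies outside the promise, so a protocol for the problem may behave arbitrarily on it.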
