Perfect L_p Sampling in a Data Stream

08/16/2018
by Rajesh Jayaram, et al.

In this paper, we resolve the one-pass space complexity of L_p sampling for p ∈ (0,2). Given a stream of updates (insertions and deletions) to the coordinates of an underlying vector f ∈ R^n, a perfect L_p sampler must output an index i with probability |f_i|^p / ‖f‖_p^p, and is allowed to fail with some probability δ. So far, for p > 0 no algorithm has been shown to solve the problem exactly using poly(log n) bits of space. In 2010, Monemizadeh and Woodruff introduced an approximate L_p sampler, which outputs i with probability (1 ± ν)|f_i|^p / ‖f‖_p^p, using space polynomial in ν^{-1} and log(n). The space complexity was later reduced by Jowhari, Sağlam, and Tardos to roughly O(ν^{-p} log^2 n log δ^{-1}) for p ∈ (0,2), which tightly matches the Ω(log^2 n log δ^{-1}) lower bound in terms of n and δ, but is loose in terms of ν. Given these nearly tight bounds, it is perhaps surprising that no lower bound at all exists in terms of ν; not even a bound of Ω(ν^{-1}) is known. In this paper, we explain this phenomenon by demonstrating the existence of an O(log^2 n log δ^{-1})-bit perfect L_p sampler for p ∈ (0,2). This shows that ν need not factor into the space of an L_p sampler, which completely closes the complexity of the problem for this range of p. For p = 2, our bound is O(log^3 n log δ^{-1}) bits, which matches the prior best known upper bound of O(ν^{-2} log^3 n log δ^{-1}), but has no dependence on ν. Finally, we show improved upper and lower bounds for returning a (1 ± ϵ) relative error estimate of the frequency f_i of the sampled index i.
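To make the target distribution concrete, the sketch below computes |f_i|^p / ‖f‖_p^p offline from a stream of (index, delta) updates and draws an index from it. It is only an illustration of the definition: it stores f explicitly in linear space and is not the streaming algorithm of the paper. All function names are hypothetical. The last function relies on a standard property of exponential random variables (the index minimizing E_i / |f_i|^p, with E_i independent Exp(1) variables, is distributed exactly according to the L_p distribution); presenting it here is an assumption made for illustration, not a description of the paper's construction.

```python
import random
from collections import defaultdict


def apply_stream(updates):
    """Accumulate (index, delta) updates into a sparse frequency vector f."""
    f = defaultdict(float)
    for i, delta in updates:
        f[i] += delta
    # Keep only non-zero coordinates; zero coordinates are never sampled.
    return {i: v for i, v in f.items() if v != 0}


def lp_distribution(f, p):
    """Return the target distribution Pr[i] = |f_i|^p / ||f||_p^p."""
    weights = {i: abs(v) ** p for i, v in f.items()}
    norm = sum(weights.values())
    if norm == 0:
        raise ValueError("f is the zero vector; no index can be sampled")
    return {i: w / norm for i, w in weights.items()}


def offline_lp_sample(f, p, rng=random):
    """Draw one index from the exact L_p distribution (linear space, offline)."""
    dist = lp_distribution(f, p)
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]


def exponential_race_sample(f, p, rng=random):
    """Same distribution via independent Exp(1) variables E_i:
    return the index minimizing E_i / |f_i|^p (assumes f has no zero entries)."""
    return min(f, key=lambda i: rng.expovariate(1.0) / abs(f[i]) ** p)
```

For example, the updates `[(0, 3), (1, 5), (1, -4)]` leave f_0 = 3 and f_1 = 1, so for p = 1 index 0 should be returned with probability 3/4, and for p = 2 with probability 9/10.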
