Separations for Estimating Large Frequency Moments on Data Streams

05/08/2021
by David P. Woodruff, et al.

We study the classical problem of moment estimation of an underlying vector whose n coordinates are implicitly defined through a series of updates in a data stream. We show that if the updates to the vector arrive in the random-order insertion-only model, then there exist space-efficient algorithms with improved dependencies on the approximation parameter ε. In particular, for any real p > 2, we first obtain an algorithm for F_p moment estimation using 𝒪̃(1/ε^{4/p} · n^{1-2/p}) bits of memory. Our techniques also give algorithms for F_p moment estimation with p > 2 on arbitrary-order insertion-only and turnstile streams, using 𝒪̃(1/ε^{4/p} · n^{1-2/p}) bits of space and two passes, which is the first optimal multi-pass F_p estimation algorithm up to log n factors. Finally, we give an improved lower bound of Ω(1/ε^2 · n^{1-2/p}) for one-pass insertion-only streams. Our results separate the complexity of this problem both between random and non-random orders and between one-pass and multi-pass streams.
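As a point of reference for the quantities in the abstract, the sketch below computes F_p exactly (the sum of the p-th powers of the item frequencies) for a small insertion-only stream and evaluates the two bound expressions with constants and logarithmic factors dropped. It is an illustration only, not the paper's algorithm: the function names are hypothetical, and the exact computation stores every frequency, which is precisely what a streaming algorithm avoids.

```python
from collections import Counter


def frequency_moment(stream, p):
    """Exact F_p = sum_i f_i^p, where f_i is the number of times item i appears.

    Uses linear space; shown only to define the quantity being estimated.
    """
    freq = Counter(stream)
    return sum(count ** p for count in freq.values())


def upper_bound_term(n, eps, p):
    """The expression 1/eps^(4/p) * n^(1-2/p) from the abstract (no constants, no log factors)."""
    return eps ** (-4.0 / p) * n ** (1.0 - 2.0 / p)


def lower_bound_term(n, eps, p):
    """The expression 1/eps^2 * n^(1-2/p) from the one-pass lower bound (no constants)."""
    return eps ** (-2.0) * n ** (1.0 - 2.0 / p)


if __name__ == "__main__":
    stream = [1, 2, 1, 3, 1, 2]  # frequencies: item 1 -> 3, item 2 -> 2, item 3 -> 1
    print(frequency_moment(stream, 3))  # 3^3 + 2^3 + 1^3 = 36

    # The gap between the two expressions is eps^(4/p - 2), which grows as eps shrinks.
    n, eps, p = 10**6, 0.1, 4
    print(lower_bound_term(n, eps, p) / upper_bound_term(n, eps, p))
```

For p > 2 the exponent 4/p is strictly less than 2, so the ratio of the two expressions, ε^{4/p - 2}, increases as ε decreases; this gap is the separation the abstract refers to.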

research
03/06/2018

Revisiting Frequency Moment Estimation in Random Order Streams

We revisit one of the classic problems in the data stream literature, na...
research
07/12/2019

Towards Optimal Moment Estimation in Streaming and Distributed Models

One of the oldest problems in the data stream model is to approximate th...
research
05/24/2021

A Simple Proof of a New Set Disjointness with Applications to Data Streams

The multiplayer promise set disjointness is one of the most widely used ...
research
11/09/2022

Tight Bounds for Vertex Connectivity in Dynamic Streams

We present a streaming algorithm for the vertex connectivity problem in ...
research
08/16/2018

Perfect L_p Sampling in a Data Stream

In this paper, we resolve the one-pass space complexity of L_p sampling ...
research
08/26/2021

Truly Perfect Samplers for Data Streams and Sliding Windows

In the G-sampling problem, the goal is to output an index i of a vector ...
research
02/17/2008

Compressed Counting

Counting is among the most fundamental operations in computing. For exam...