Approximately Counting Subgraphs in Data Streams
Estimating the number of subgraphs in data streams is a fundamental problem that has received considerable attention over the past decade. In this paper, we give improved streaming algorithms for approximately counting the number of occurrences of an arbitrary subgraph H, denoted #H, when the input graph G is represented as a stream of m edges. To obtain our algorithms, we provide a generic transformation that converts constant-round sublinear-time graph algorithms in the query access model into constant-pass sublinear-space graph streaming algorithms. Using this transformation, we obtain the following results.
1. We give a 3-pass turnstile streaming algorithm for (1±ϵ)-approximating #H in Õ(m^{ρ(H)} / (ϵ^2 · #H)) space, where ρ(H) is the fractional edge-cover number of H. This improves upon and generalizes a result of McGregor et al. [PODS 2016], who gave a 3-pass insertion-only streaming algorithm for (1±ϵ)-approximating the number #T of triangles in Õ(m^{3/2} / (ϵ^2 · #T)) space, provided the algorithm is given additional oracle access to the vertex degrees.
2. We provide a constant-pass streaming algorithm for (1±ϵ)-approximating #K_r in Õ(m · λ^{r-2} / (ϵ^2 · #K_r)) space for any r ≥ 3, in a graph G with degeneracy λ, where K_r is the clique on r vertices. This resolves a conjecture of Bera and Seshadhri [PODS 2020].
More generally, our reduction relates the adaptivity of a query algorithm to the pass complexity of the corresponding streaming algorithm, and it applies to all algorithms in standard sublinear-time graph query models, e.g., the (augmented) general model.
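To make the quantities in the two space bounds concrete, the following is a minimal offline sketch in Python; it is not the streaming algorithm of the paper. All function names are hypothetical. It computes ρ(H) by brute force (assuming the known fact that the fractional edge-cover LP of a graph has a half-integral optimum), computes the degeneracy λ by repeatedly removing a minimum-degree vertex, counts cliques exactly, and evaluates the two bound expressions while ignoring the polylogarithmic factors and constants hidden by Õ(·).

```python
from fractions import Fraction
from itertools import combinations, product


def fractional_edge_cover(edges):
    """rho(H) by brute force over edge weights in {0, 1/2, 1}.

    Assumption: an optimal fractional edge cover of a graph can be taken
    half-integral, so this search suffices. Only suitable for tiny H.
    """
    vertices = {v for e in edges for v in e}
    best = None
    # Weights are stored in units of 1/2, so "covered" means a sum >= 2.
    for weights in product((0, 1, 2), repeat=len(edges)):
        covered = all(
            sum(w for w, e in zip(weights, edges) if v in e) >= 2
            for v in vertices
        )
        if covered:
            total = Fraction(sum(weights), 2)
            if best is None or total < best:
                best = total
    return best


def degeneracy(edges):
    """Degeneracy via peeling: repeatedly delete a minimum-degree vertex."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    lam = 0
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))
        lam = max(lam, len(adj[v]))
        for u in adj.pop(v):
            adj[u].discard(v)
    return lam


def count_cliques(edges, r):
    """Exact #K_r by checking every r-subset of vertices (tiny graphs only)."""
    edge_set = {frozenset(e) for e in edges}
    vertices = sorted({v for e in edges for v in e})
    return sum(
        all(frozenset(p) in edge_set for p in combinations(c, 2))
        for c in combinations(vertices, r)
    )


def space_bound_general(m, rho, eps, count):
    """m^rho(H) / (eps^2 * #H), without polylog factors or constants."""
    return m ** float(rho) / (eps ** 2 * count)


def space_bound_clique(m, lam, r, eps, count):
    """m * lambda^(r-2) / (eps^2 * #K_r), without polylog factors or constants."""
    return m * lam ** (r - 2) / (eps ** 2 * count)


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]          # the pattern H
    k5 = list(combinations(range(5), 2))         # a toy input graph G
    rho = fractional_edge_cover(triangle)
    lam = degeneracy(k5)
    t = count_cliques(k5, 3)
    print("rho(H) =", rho, " lambda =", lam, " #T =", t)
    print("general bound:", space_bound_general(len(k5), rho, 0.5, t))
    print("clique bound: ", space_bound_clique(len(k5), lam, 3, 0.5, t))
```

Both expressions shrink as the number of occurrences grows, which matches the form of the bounds above: the more copies of H the graph contains, the less space is needed to estimate their number.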