Distribution Compression in Near-linear Time

11/15/2021
by Abhishek Shetty, et al.

In distribution compression, one aims to accurately summarize a probability distribution ℙ using a small number of representative points. Near-optimal thinning procedures achieve this goal by sampling n points from a Markov chain and identifying √(n) points with 𝒪(1/√(n)) discrepancy to ℙ. Unfortunately, these algorithms suffer from quadratic or super-quadratic runtime in the sample size n. To address this deficiency, we introduce Compress++, a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor of 4 in error. When combined with the quadratic-time kernel halving and kernel thinning algorithms of Dwivedi and Mackey (2021), Compress++ delivers √(n) points with 𝒪(√(log n / n)) integration error and better-than-Monte-Carlo maximum mean discrepancy in 𝒪(n log^3 n) time and 𝒪(√(n) log^2 n) space. Moreover, Compress++ enjoys the same near-linear runtime given any quadratic-time input and reduces the runtime of super-quadratic algorithms by a square-root factor. In our benchmarks with high-dimensional Monte Carlo samples and Markov chains targeting challenging differential equation posteriors, Compress++ matches or nearly matches the accuracy of its input algorithm in orders of magnitude less time.
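
The recursion sketched in the abstract (split the input into four buckets, compress each recursively, then halve the concatenation once) is concrete enough to illustrate in code. Below is a minimal sketch under stated assumptions: n is a power of 4, and the `halve` and `thin` arguments, along with the random-sampling placeholders in the demo, are hypothetical stand-ins, not the kernel halving and kernel thinning algorithms of Dwivedi and Mackey (2021).

```python
# Minimal sketch of the Compress / Compress++ recursion described in the
# abstract, assuming n is a power of 4. `halve`, `thin`, and the random
# sampling used in the demo are illustrative placeholders, not the paper's
# kernel halving / kernel thinning routines.
import random


def compress(points, g, halve):
    """Recursively reduce n points to 2^g * sqrt(n) points.

    points: input sample (length a power of 4 in this sketch)
    g:      oversampling parameter (the paper takes g on the order of log log n)
    halve:  routine mapping a sequence of 2m points to a subsequence of m points
    """
    n = len(points)
    if n <= 4 ** g:  # base case: 2^g * sqrt(n) = n, so nothing to discard
        return list(points)
    # Split into four consecutive buckets and compress each recursively;
    # each call returns 2^g * sqrt(n/4) = (2^g * sqrt(n)) / 2 points.
    quarter = n // 4
    combined = [p
                for i in range(4)
                for p in compress(points[i * quarter:(i + 1) * quarter], g, halve)]
    # combined now holds 2^(g+1) * sqrt(n) points; one halving step finishes.
    return halve(combined)


def compress_pp(points, g, halve, thin):
    """Compress++: over-compress to 2^g * sqrt(n) points, then thin to sqrt(n)."""
    return thin(compress(points, g, halve))


if __name__ == "__main__":
    n = 4 ** 5                                                # n = 1024, sqrt(n) = 32
    pts = [random.gauss(0.0, 1.0) for _ in range(n)]
    halve = lambda xs: random.sample(xs, len(xs) // 2)        # placeholder halving
    thin = lambda xs: random.sample(xs, 32)                   # placeholder thinning
    print(len(compress(pts, g=1, halve=halve)))               # 2^g * sqrt(n) = 64
    print(len(compress_pp(pts, g=1, halve=halve, thin=thin))) # sqrt(n) = 32
```

This sketch also makes the runtime claim plausible: a quadratic-time `halve` is only ever applied to sequences of length 2^(g+1) √m for buckets of size m, so each of the 𝒪(log n) recursion levels costs roughly 4^g · n work; with g = 𝒪(log log n) this is consistent with the near-linear 𝒪(n log^3 n) runtime quoted above.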
