A Faster Sampler for Discrete Determinantal Point Processes

10/31/2022
by Simon Barthelmé, et al.

Discrete Determinantal Point Processes (DPPs) have a wide array of potential applications for subsampling datasets. They are, however, held back in some cases by the high cost of sampling. In the worst case, the sampling cost scales as O(n^3), where n is the number of elements of the ground set. A popular workaround to this prohibitive cost is to sample DPPs defined by low-rank kernels. In such cases, the cost of standard sampling algorithms scales as O(np^2 + nm^2), where m is the (average) number of points sampled by the DPP (usually m ≪ n) and p (m ≤ p ≤ n) is the rank of the kernel used to define the DPP. The first term, O(np^2), comes from an SVD-like step. We focus here on the second term of this cost, O(nm^2), and show that it can be brought down to O(nm + m^3 log m) without any loss of exactness. In practice, we observe very substantial speedups over the classical algorithm as soon as n > 1,000. The algorithm described here is a close variant of the standard algorithm for sampling continuous DPPs and relies on rejection sampling. In the specific case of projection DPPs, we also show that any additional sample can be drawn in time O(m^3 log m). Finally, an interesting by-product of the analysis is that a realisation from a DPP is typically contained in a subset of size O(m log m) formed by i.i.d. leverage-score sampling.
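For context, the O(np^2 + nm^2) baseline the abstract refers to is the classical spectral sampler for low-rank L-ensembles. Below is a minimal NumPy sketch of that baseline, not of the faster rejection-based algorithm proposed in the paper; the function name and the toy data at the end are illustrative choices, not the authors' code.

```python
import numpy as np

def sample_lowrank_dpp(Phi, rng=None):
    """Classical spectral sampler for the L-ensemble DPP with low-rank
    kernel L = Phi @ Phi.T, where Phi has shape (n, p).

    The eigendecomposition of the dual kernel costs O(n p^2); the
    sequential loop below costs O(n m^2), which is the term the paper's
    rejection-based variant reduces to O(n m + m^3 log m).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, p = Phi.shape

    # O(n p^2): eigendecompose the p x p dual kernel C = Phi.T @ Phi.
    eigvals, W = np.linalg.eigh(Phi.T @ Phi)

    # Keep eigendirection i with probability lambda_i / (1 + lambda_i).
    keep = rng.random(p) < eigvals / (1.0 + eigvals)
    if not keep.any():
        return np.array([], dtype=int)

    # Orthonormal eigenvectors of L spanning the selected eigenspace.
    V = Phi @ (W[:, keep] / np.sqrt(eigvals[keep]))   # shape (n, m)
    m = V.shape[1]

    # O(n m^2): sequential sampling of the projection DPP with kernel V V^T.
    sample = []
    for _ in range(m):
        lev = np.sum(V**2, axis=1)          # diagonal of the conditional kernel
        i = rng.choice(n, p=lev / lev.sum())
        sample.append(i)
        # Condition on item i: project the rows of V onto the orthogonal
        # complement of row i (one Schur-complement step), O(n m) per step.
        u = V[i] / np.linalg.norm(V[i])
        V = V - np.outer(V @ u, u)
    return np.array(sample)

if __name__ == "__main__":
    # Toy usage: n = 2000 items with random rank-30 features.
    rng = np.random.default_rng(0)
    Phi = rng.standard_normal((2000, 30)) / np.sqrt(30)
    print(sample_lowrank_dpp(Phi, rng))
```

The row squared norms computed on the first pass of the loop are the quantities that play the role of leverage scores in the abstract's final remark; each subsequent pass costs O(nm), which is where the overall O(nm^2) of the classical sampler comes from.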


