Correctness of Sequential Monte Carlo Inference for Probabilistic Programming Languages
Probabilistic programming languages (PPLs) make it possible to reason under uncertainty by encoding inference problems as programs. To solve these inference problems, PPLs employ many different inference algorithms. Existing research on such inference algorithms mainly concerns their implementation and efficiency, rather than the correctness of the algorithms themselves when applied in the context of expressive PPLs. To remedy this, we give a correctness proof for sequential Monte Carlo (SMC) methods in the context of an expressive PPL calculus, representative of popular PPLs such as WebPPL, Anglican, and Birch. Previous work has studied the correctness of Markov chain Monte Carlo (MCMC) using an operational semantics, and in a denotational setting without term recursion. However, for SMC inference, one of the most commonly used algorithms in PPLs today, no formal correctness proof exists in an untyped operational setting. In particular, an open question is whether the resample locations in a probabilistic program affect the correctness of SMC. We solve this fundamental problem and make three novel contributions: (i) we prove, for the first time, that SMC inference is correct in an untyped operational context of an expressive PPL calculus, and we show that, under mild assumptions, the correctness is independent of the placement of explicit resampling points; (ii) we formalize the bootstrap particle filter for the calculus; and (iii) we demonstrate a classical law of large numbers from the SMC literature that holds as a consequence of our proof.
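For readers unfamiliar with the bootstrap particle filter named above, the following is a minimal Python sketch of the general technique. It is not the formalization for the PPL calculus given in the paper. The model (a Gaussian random walk with Gaussian observations), the function names, and the choice to resample after every observation are all assumptions made purely for illustration.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) at x."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def bootstrap_particle_filter(observations, n_particles=1000, seed=0):
    """Bootstrap particle filter for a hypothetical toy model (an assumption):

        x_0 ~ N(0, 1),  x_t = x_{t-1} + N(0, 1),  y_t ~ N(x_t, 1).

    Returns an estimate of the log marginal likelihood and the final
    (resampled, equally weighted) particles.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_evidence = 0.0

    for y in observations:
        # Propagate: sample each particle from the transition prior.
        particles = [x + rng.gauss(0.0, 1.0) for x in particles]

        # Weight: score each particle by the observation likelihood.
        log_w = [normal_logpdf(y, x, 1.0) for x in particles]

        # Subtract the max log weight before exponentiating, for stability.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]

        # Accumulate the log of the average unnormalized weight.
        log_evidence += m + math.log(sum(w) / n_particles)

        # Resample: multinomial resampling proportional to the weights.
        # Here this happens after every observation, which is only one
        # possible placement of resampling points.
        particles = rng.choices(particles, weights=w, k=n_particles)

    return log_evidence, particles
```

Calling `bootstrap_particle_filter([0.5, 1.0, 1.5], n_particles=2000)` returns a log-evidence estimate together with particles approximating the filtering distribution of the final state. In the spirit of a law of large numbers, averages over the particles are expected to approach the corresponding posterior expectations as `n_particles` grows.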