Measuring the accuracy of likelihood-free inference
Complex scientific models where the likelihood cannot be evaluated present a challenge for statistical inference. Over the past two decades, a wide range of algorithms have been proposed for learning parameters in computationally feasible ways, often under the heading of approximate Bayesian computation or likelihood-free inference. There is, however, no consensus on how to rigorously evaluate the performance of these algorithms. Here, we argue for scoring algorithms by the mean squared error in estimating expectations of functions with respect to the posterior. We show that score implies common alternatives, including the acceptance rate and effective sample size, as limiting special cases. We then derive asymptotically optimal distributions for choosing or sampling discrete or continuous simulation parameters, respectively. Our recommendations differ significantly from guidelines based on alternative scores outside of their region of validity. As an application, we show sequential Monte Carlo in this context can be made more accurate with no new samples by accepting particles from all rounds.
READ FULL TEXT