Conjugate Bayes for probit regression via unified skew-normals
Regression models for dichotomous data are ubiquitous in statistics. Besides being fundamental for inference on binary responses, such representations also provide an essential building block in more complex formulations, including predictor-dependent mixture models, deep neural networks, and graphical models. Within a Bayesian framework, inference typically proceeds by updating Gaussian priors for the regression coefficients with the likelihood induced by a probit or logit model for the observed binary responses. The apparent absence of conjugacy in this updating has motivated a wide variety of computational methods, including Markov chain Monte Carlo (MCMC) routines and algorithms that approximate the posterior distribution. Although such methods are routinely implemented, data-augmentation MCMC faces convergence and mixing issues in imbalanced-data settings and in hierarchical models, whereas approximate routines fail to capture the skewness and heavy tails typically observed in the posterior distribution of the coefficients. This article shows that, under Gaussian priors, the posterior for the coefficients of a probit model is in fact analytically available and coincides with a unified skew-normal distribution. Leveraging this result, it is possible to study the posterior explicitly, along with the predictive probability mass function of the responses, and to derive a novel and more efficient sampler for high-dimensional inference. A conjugate class of priors for Bayesian probit regression is also provided, improving flexibility in prior specification without sacrificing tractability in posterior inference.
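As a rough illustration of how the conjugacy result can be used, the Python/NumPy sketch below computes the unified skew-normal (SUN) posterior parameters for a probit model with prior beta ~ N_p(xi, Omega) and draws exact i.i.d. posterior samples through the SUN additive representation beta = xi + omega (V0 + Delta Gamma^{-1} V1), where V0 is Gaussian and V1 is a multivariate normal truncated below -gamma. The parameterization assumed here (D = diag(2y - 1) X, s the square roots of the diagonal of D Omega D' + I_n, gamma = s^{-1} D xi, Gamma = s^{-1}(D Omega D' + I_n)s^{-1}, Delta = Omega_bar omega D' s^{-1}) follows the paper's main theorem as commonly stated; the function names are hypothetical, and the naive rejection step for V1 is a placeholder for the specialized truncated-normal routines (e.g., minimax tilting) that an efficient high-dimensional sampler would rely on.

import numpy as np

def sun_probit_params(X, y, xi, Omega):
    # Posterior SUN parameters; D flips the sign of x_i whenever y_i = 0,
    # so the probit likelihood becomes prod_i Phi(d_i' beta).
    n = X.shape[0]
    D = (2.0 * y - 1.0)[:, None] * X
    M = D @ Omega @ D.T + np.eye(n)
    s = np.sqrt(np.diag(M))                        # normalizing scales
    Gamma = M / np.outer(s, s)                     # correlation-scale matrix
    gamma = (D @ xi) / s                           # truncation thresholds
    omega = np.sqrt(np.diag(Omega))
    Omega_bar = Omega / np.outer(omega, omega)
    Delta = (Omega @ D.T) / omega[:, None] / s[None, :]
    return gamma, Gamma, Delta, omega, Omega_bar

def sample_posterior(X, y, xi, Omega, n_draws=1000, seed=0):
    # i.i.d. draws via beta = xi + omega * (V0 + Delta Gamma^{-1} V1).
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma, Gamma, Delta, omega, Omega_bar = sun_probit_params(X, y, xi, Omega)
    A = Delta @ np.linalg.inv(Gamma)               # Delta Gamma^{-1}
    cov0 = Omega_bar - A @ Delta.T                 # covariance of the Gaussian part
    V0 = rng.multivariate_normal(np.zeros(p), cov0, size=n_draws)
    # V1 ~ N_n(0, Gamma) truncated to {t : t > -gamma}. Naive rejection is used
    # purely for illustration; its acceptance rate collapses as n grows, which
    # is exactly why specialized truncated-normal samplers matter in practice.
    V1 = np.empty((n_draws, len(gamma)))
    k = 0
    while k < n_draws:
        z = rng.multivariate_normal(np.zeros(len(gamma)), Gamma)
        if np.all(z > -gamma):
            V1[k] = z
            k += 1
    return xi + omega * (V0 + V1 @ A.T)

For instance, with a weakly informative prior xi = np.zeros(p) and Omega = 25 * np.eye(p), the returned array holds one exact posterior draw of beta per row, so posterior means, quantiles, and predictive probabilities Phi(x_new' beta) can be estimated directly without any MCMC convergence diagnostics.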