Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages

by Zheng Wei, et al.

As big data sets have grown rapidly in recent years, new parallel computing methods have been developed for large-scale Bayesian analysis. These methods partition a large data set by observations into subsets, run independent Bayesian Markov chain Monte Carlo (MCMC) analyses on the subsets, and combine the subset posteriors to estimate the full data posterior. In Bayesian nested hierarchical models, however, typically only a few parameters are common to the full data set, and most parameters are group-specific. Parallel MCMC methods that take the structure of the model into account and split the full data set by groups rather than by observations are therefore a more natural approach. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated in parallel, independently of the other groups. The stage 1 posteriors are then used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in simulation studies and a real data analysis that our method closely approximates the full data analysis with greatly reduced computation times. We also detail the advantages of our method over existing parallel MCMC computing methods.
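To make the two stages concrete, here is a minimal Python sketch on a toy two-level normal model. It is an illustration under stated assumptions, not the authors' implementation. The abstract works with three-level and four-level models; the model below, the known standard deviations `SIGMA` and `TAU`, the flat priors, and the function names `stage1_group` and `two_stage` are all hypothetical choices made for brevity. Stage 1 samples each group's posterior directly (standing in for a per-group MCMC run), and threads stand in for whatever parallel workers would be used in practice. Stage 2 proposes each group parameter from its stage 1 draws; because the proposal already contains the group likelihood, the Metropolis-Hastings ratio reduces to a ratio of hierarchical prior densities.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

SIGMA = 1.0  # assumed known within-group standard deviation
TAU = 2.0    # assumed known between-group standard deviation


def stage1_group(args):
    """Stage 1: posterior draws of one group's mean, ignoring all other groups.

    With a flat prior on theta_j the posterior is N(ybar_j, SIGMA^2 / n_j),
    so it is sampled directly here in place of a per-group MCMC run.
    """
    y, n_draws, seed = args
    rng = np.random.default_rng(seed)
    return rng.normal(y.mean(), SIGMA / np.sqrt(len(y)), size=n_draws)


def two_stage(groups, n_draws=2000, n_iter=3000, seed=0):
    """Toy model: y_ij ~ N(theta_j, SIGMA^2), theta_j ~ N(mu, TAU^2), flat prior on mu."""
    rng = np.random.default_rng(seed)
    jobs = [(y, n_draws, seed + 1 + j) for j, y in enumerate(groups)]
    # Stage 1 runs independently per group, so it parallelizes trivially.
    with ThreadPoolExecutor() as pool:
        stage1 = list(pool.map(stage1_group, jobs))

    n_groups = len(groups)
    theta = np.array([draws[0] for draws in stage1])  # start at stage 1 draws
    mu_trace = np.empty(n_iter)
    theta_trace = np.empty((n_iter, n_groups))

    for t in range(n_iter):
        # Gibbs update of the shared parameter: mu | theta ~ N(mean(theta), TAU^2 / J).
        mu = rng.normal(theta.mean(), TAU / np.sqrt(n_groups))
        for j in range(n_groups):
            # Independence proposal: a random draw from the group's stage 1 posterior.
            prop = stage1[j][rng.integers(n_draws)]
            # The likelihood cancels between target and proposal; only the
            # hierarchical prior N(theta_j | mu, TAU^2) remains in the ratio.
            log_ratio = ((theta[j] - mu) ** 2 - (prop - mu) ** 2) / (2 * TAU ** 2)
            if np.log(rng.uniform()) < log_ratio:
                theta[j] = prop
        mu_trace[t] = mu
        theta_trace[t] = theta
    return mu_trace, theta_trace
```

Calling `two_stage(groups)` with a list of one-dimensional NumPy arrays, one per group, returns the stage 2 traces for the shared mean and the group means. In this toy setting stage 2 never touches the raw observations again, which is where the computational saving of reusing stage 1 output comes from.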


