An adaptive Hessian approximated stochastic gradient MCMC method

10/03/2020, by Yating Wang, et al.

Bayesian approaches have been successfully integrated into the training of deep neural networks. One popular family is stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods, which have gained increasing interest due to their scalability to large datasets and their ability to avoid overfitting. Although standard SG-MCMC methods perform well on a variety of problems, they may be inefficient when the random variables in the target posterior density differ in scale or are highly correlated. In this work, we present an adaptive Hessian approximated stochastic gradient MCMC method that incorporates local geometric information while sampling from the posterior. The idea is to apply stochastic approximation to sequentially update a preconditioning matrix at each iteration. The preconditioner carries second-order information and can guide the random walk of the sampler efficiently. Instead of computing and storing the full Hessian of the log posterior, we use a limited memory of samples and their stochastic gradients to approximate the inverse Hessian-vector product in the updating formula. Moreover, by smoothly optimizing the preconditioning matrix, the proposed algorithm asymptotically converges to the target distribution with a controllable bias under mild conditions. To reduce the computational burden of training and testing, we adopt a magnitude-based weight pruning method to enforce sparsity in the network. Our method is user-friendly and extends standard SG-MCMC updating rules simply by adding a preconditioner. The sparse approximation of the inverse Hessian alleviates the storage and computational complexity for high-dimensional models. The bias introduced by stochastic approximation is controllable and can be analyzed theoretically. Numerical experiments are performed on several problems.
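The limited-memory inverse-Hessian-vector product described above is commonly realized with an L-BFGS-style two-loop recursion over recent parameter and gradient differences. The following Python sketch is not the authors' implementation; the function names, the isotropic injected noise (the full method couples the noise to the preconditioner), and the pruning interface are all assumptions made for illustration. It shows a preconditioned SGLD-type step built on that recursion, plus a simple magnitude-based pruning rule:

```python
import numpy as np

def two_loop_recursion(grad, s_hist, y_hist):
    """Approximate (inverse Hessian) @ grad with the L-BFGS two-loop
    recursion, using a short history of parameter differences s_k and
    gradient differences y_k instead of the full Hessian."""
    q = grad.astype(float)
    rhos = [1.0 / (y @ s) for s, y in zip(s_hist, y_hist)]
    alphas = []
    # First loop: most recent pair first.
    for s, y, rho in reversed(list(zip(s_hist, y_hist, rhos))):
        a = rho * (s @ q)
        alphas.append(a)
        q = q - a * y
    # Initial matrix H_0 = gamma * I, scaled by the most recent pair.
    s, y = s_hist[-1], y_hist[-1]
    r = ((s @ y) / (y @ y)) * q
    # Second loop: oldest pair first, alphas consumed in matching order.
    for (s, y, rho), a in zip(zip(s_hist, y_hist, rhos), reversed(alphas)):
        beta = rho * (y @ r)
        r = r + (a - beta) * s
    return r

def psgld_step(theta, stoch_grad, s_hist, y_hist, step, rng):
    """One preconditioned Langevin update (simplified sketch): the drift
    preconditions the stochastic gradient of the negative log posterior
    via the two-loop recursion.  Isotropic noise is a simplification."""
    drift = two_loop_recursion(stoch_grad, s_hist, y_hist) if s_hist else stoch_grad
    noise = np.sqrt(2.0 * step) * rng.standard_normal(theta.shape)
    return theta - step * drift + noise

def magnitude_prune(weights, sparsity):
    """Magnitude-based pruning: zero out the fraction `sparsity` of
    entries with the smallest absolute values."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    thresh = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned
```

On a quadratic negative log posterior with an ill-conditioned Hessian, the recursion rescales each gradient coordinate toward the Newton direction, which is exactly the scale-difference problem the abstract describes; only a few (s, y) pairs need to be stored rather than the full Hessian.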

Related research

- Stochastic Quasi-Newton Langevin Monte Carlo (02/10/2016): Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods...
- Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications (06/29/2020): In this work, we propose a Bayesian type sparse deep learning algorithm...
- Extended Stochastic Gradient MCMC for Large-Scale Bayesian Variable Selection (02/07/2020): Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have rece...
- A Complete Recipe for Stochastic Gradient MCMC (06/15/2015): Many recent Markov chain Monte Carlo (MCMC) samplers leverage continuous...
- Posterior inference unchained with EL_2O (01/14/2019): Statistical inference of analytically non-tractable posteriors is a diff...
- An Adaptive Empirical Bayesian Method for Sparse Deep Learning (10/23/2019): We propose a novel adaptive empirical Bayesian (AEB) method for sparse d...
- Infinite dimensional adaptive MCMC for Gaussian processes (04/13/2018): Latent Gaussian processes are widely applied in many fields like statis...
