Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes
Variational inference is a popular approach to reasoning about uncertainty in Bayesian neural networks (BNNs) and deep Gaussian processes (deep GPs). However, typical variational approximate posteriors for deep BNNs and GPs factorise across layers. This is a problematic assumption, because what matters in a deep BNN or GP is the input-output transformation defined by the full network, not the input-output transformation defined by any individual layer. We therefore propose an approximate posterior with dependencies across layers that seeks to jointly model the input-output transformation of the full network. Our approximate posterior is based on a "global" set of inducing points that are defined only at the input layer and propagated through the network. By showing that BNNs are a special case of deep GPs, we demonstrate that this approximate posterior can be used to infer both the weights of a BNN and the functions in a deep GP. Further, we consider a new correlated prior over the weights of a BNN, which, in combination with global inducing points, gives state-of-the-art performance for a variational Bayesian method of 86.7% on CIFAR-10, without data augmentation or posterior tempering.
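The core idea, defining inducing inputs only at the input layer and propagating them through the network so that each layer's weight posterior is conditioned on the propagated inducing representation, can be illustrated with a minimal sketch. The code below is an illustrative toy only, not the authors' implementation: NumPy, ReLU layers, and a Bayesian-linear-regression-style conditional with hypothetical learned pseudo-targets V and per-inducing-point precisions are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_layer_weights(U, V, prec, prior_var=1.0):
    # Sample one layer's weights conditioned on the propagated inducing
    # representation U (M x D_in), with learned pseudo-targets V (M x D_out)
    # and per-inducing-point precisions prec (M,). Each column of W gets a
    # Gaussian posterior with covariance S and mean S U^T Lambda V, as in
    # Bayesian linear regression against the pseudo-targets.
    D_in = U.shape[1]
    Lam = np.diag(prec)
    S = np.linalg.inv(np.eye(D_in) / prior_var + U.T @ Lam @ U)
    mean = S @ U.T @ Lam @ V
    L = np.linalg.cholesky(S)
    return mean + L @ rng.standard_normal((D_in, V.shape[1]))

def forward_with_global_inducing(X, Z, layer_params):
    # Propagate the data X and the global inducing inputs Z together.
    # Each layer's weights are sampled conditioned on the *propagated*
    # inducing representation, so later layers depend on the weights
    # drawn at earlier layers (cross-layer dependence).
    U, H = Z, X
    for i, (V, prec) in enumerate(layer_params):
        W = sample_layer_weights(U, V, prec)
        U, H = U @ W, H @ W
        if i < len(layer_params) - 1:          # ReLU on all but the last layer
            U, H = np.maximum(U, 0.0), np.maximum(H, 0.0)
    return H

# Toy usage: 5 global inducing points in a 3-d input space, two layers.
X = rng.standard_normal((10, 3))                # data inputs
Z = rng.standard_normal((5, 3))                 # global inducing inputs
layer_params = [
    (rng.standard_normal((5, 4)), np.ones(5)),  # hypothetical pseudo-targets, precisions
    (rng.standard_normal((5, 1)), np.ones(5)),
]
print(forward_with_global_inducing(X, Z, layer_params).shape)  # -> (10, 1)
```

Because the inducing representation U is recomputed with each layer's sampled weights, the approximate posterior over a given layer depends on all earlier layers, which is the cross-layer dependence the abstract describes.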