An Improved Variational Approximate Posterior for the Deep Wishart Process

05/23/2023
by   Sebastian Ober, et al.

Deep kernel processes are a recently introduced class of deep Bayesian models that have the flexibility of neural networks, but work entirely with Gram matrices. They operate by alternately sampling a Gram matrix from a distribution over positive semi-definite matrices, and applying a deterministic transformation. When the distribution is chosen to be Wishart, the model is called a deep Wishart process (DWP). This particular model is of interest because its prior is equivalent to a deep Gaussian process (DGP) prior, but at the same time it is invariant to rotational symmetries, leading to a simpler posterior distribution. Practical inference in the DWP was made possible in recent work (Ober and Aitchison 2021a, "A variational approximate posterior for the deep Wishart process"), where the authors used a generalisation of the Bartlett decomposition of the Wishart distribution as the variational approximate posterior. However, predictive performance in that paper was less impressive than one might expect, with the DWP only beating a DGP on a few of the UCI datasets used for comparison. In this paper, we show that further generalising their distribution to allow linear combinations of rows and columns in the Bartlett decomposition results in better predictive performance, while incurring negligible additional computational cost.
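To make the alternating structure described above concrete, the following is a minimal sketch of drawing from a DWP-style prior, with each Wishart draw produced by the standard Bartlett decomposition. It is not the authors' implementation or their variational posterior. The function names, the squared-exponential kernel computed from the Gram matrix, the `K / nu` scaling, the initial Gram matrix `X @ X.T / d`, and the jitter term are all assumptions made for illustration.

```python
import numpy as np


def sample_wishart_bartlett(scale, nu, rng):
    """Draw one sample from Wishart(scale, nu) via the Bartlett decomposition.

    With scale = L L^T, the sample is L A A^T L^T, where A is lower triangular,
    A[j, j]^2 ~ chi^2(nu - j) (0-indexed j) and A[i, j] ~ N(0, 1) for i > j.
    """
    p = scale.shape[0]
    if nu <= p - 1:
        raise ValueError("nu must exceed p - 1 for a non-singular Wishart")
    L = np.linalg.cholesky(scale)
    # Strictly lower-triangular part: independent standard normals.
    A = np.tril(rng.standard_normal((p, p)), k=-1)
    # Diagonal: square roots of chi-squared draws with decreasing degrees of freedom.
    A[np.diag_indices(p)] = np.sqrt(rng.chisquare(nu - np.arange(p)))
    LA = L @ A
    return LA @ LA.T


def sq_exp_kernel_from_gram(G):
    """Deterministic transformation (assumed): a squared-exponential kernel
    written purely in terms of Gram-matrix entries."""
    d = np.diag(G)
    sq_dist = d[:, None] - 2.0 * G + d[None, :]
    return np.exp(-0.5 * sq_dist)


def sample_dwp_prior(X, nu, n_layers, seed=0, jitter=1e-6):
    """Alternate a kernel transformation and a Wishart draw for n_layers layers."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    G = X @ X.T / d  # assumed initial Gram matrix built from the inputs
    for _ in range(n_layers):
        K = sq_exp_kernel_from_gram(G) + jitter * np.eye(n)
        # Scaling by 1 / nu keeps the mean of the sampled Gram matrix equal to K.
        G = sample_wishart_bartlett(K / nu, nu, rng)
    return G


if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((6, 3))
    G = sample_dwp_prior(X, nu=8, n_layers=2)
    print(G.shape)
```

The sketch only touches Gram matrices after the first line of `sample_dwp_prior`, which is the sense in which such models "work entirely with Gram matrices". The approximate posteriors discussed in the abstract generalise the triangular factor `A`; that generalisation is not shown here.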

research
07/21/2021

A variational approximate posterior for the deep Wishart process

Recent work introduced deep kernel processes as an entirely kernel-based...
research
11/18/2020

Understanding Variational Inference in Function-Space

Recent work has attempted to directly approximate the `function-space' o...
research
10/04/2020

Deep kernel processes

We define deep kernel processes in which positive definite Gram matrices...
research
03/15/2016

Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors

We introduce a variational Bayesian neural network where the parameters ...
research
01/03/2018

A Comprehensive Bayesian Treatment of the Universal Kriging model with Matérn correlation kernels

The Gibbs reference posterior distribution provides an objective full-Ba...
research
01/03/2018

A Comprehensive Bayesian Treatment of the Universal Kriging parameters with Matérn correlation kernels

The Gibbs reference posterior distribution provides an objective full-Ba...
research
06/13/2018

Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors

Bayesian Neural Networks (BNNs) have recently received increasing attent...
