LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations

by Brian L Trippe, et al.

Due to the ease of modern data collection, applied statisticians often have access to a large set of covariates that they wish to relate to some observed outcome. Generalized linear models (GLMs) offer a particularly interpretable framework for such an analysis. In these high-dimensional problems, the number of covariates is often large relative to the number of observations, so we face non-trivial inferential uncertainty; a Bayesian approach allows coherent quantification of this uncertainty. Unfortunately, existing methods for Bayesian inference in GLMs require running times roughly cubic in parameter dimension, and so are limited to settings with at most tens of thousands of parameters. We propose to reduce time and memory costs with a low-rank approximation of the data in an approach we call LR-GLM. When used with the Laplace approximation or Markov chain Monte Carlo, LR-GLM provides a full Bayesian posterior approximation and admits running times reduced by a full factor of the parameter dimension. We rigorously establish the quality of our approximation and show how the choice of rank allows a tunable computational-statistical trade-off. Experiments support our theory and demonstrate the efficacy of LR-GLM on real large-scale datasets.
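The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a low-rank data approximation might be combined with a Laplace approximation; it should not be read as the authors' implementation. It assumes a logistic-regression likelihood, an isotropic Gaussian prior, and a projection onto the top right singular vectors of the design matrix. The function name `low_rank_laplace_logistic` and its parameters are hypothetical.

```python
import numpy as np

def low_rank_laplace_logistic(X, y, rank, prior_var=1.0, n_iter=50):
    """Laplace approximation for logistic regression in a rank-`rank` subspace.

    Assumptions (not taken from the abstract): labels y in {0, 1}, prior
    beta ~ N(0, prior_var * I), and X replaced by X @ U @ U.T, where U holds
    the top right singular vectors of X.
    """
    # Exact SVD for clarity; a randomized truncated SVD would be the cheaper
    # choice when X is large.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    U = Vt[:rank].T                      # (D, rank) orthonormal basis
    XU = X @ U                           # (N, rank) projected data

    def neg_hessian(z):
        # Negative Hessian of the log posterior in the projected coordinates.
        p = 1.0 / (1.0 + np.exp(-(XU @ z)))
        w = p * (1.0 - p)
        return p, (XU * w[:, None]).T @ XU + np.eye(rank) / prior_var

    # Newton iterations on z = U.T @ beta; every solve is rank x rank.
    z = np.zeros(rank)
    for _ in range(n_iter):
        p, H = neg_hessian(z)
        grad = XU.T @ (y - p) - z / prior_var
        step = np.linalg.solve(H, grad)
        z = z + step
        if np.linalg.norm(step) < 1e-10:
            break

    _, H = neg_hessian(z)
    cov_z = np.linalg.inv(H)             # (rank, rank) covariance of z
    mean = U @ z                         # approximate posterior mean in R^D
    return mean, U, cov_z
```

Under these assumptions the linear solves involve rank-by-rank matrices rather than D-by-D ones, which is the kind of saving the abstract describes. With the isotropic prior assumed here, the implied full covariance would be `U @ cov_z @ U.T + prior_var * (np.eye(D) - U @ U.T)`, although forming it explicitly costs memory quadratic in D and is rarely necessary.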


Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach

Bayesian inference typically requires the computation of an approximatio...

High-dimensional Nonlinear Bayesian Inference of Poroelastic Fields from Pressure Data

We investigate solution methods for large-scale inverse problems governe...

Sparse Approximate Cross-Validation for High-Dimensional GLMs

Leave-one-out cross validation (LOOCV) can be particularly accurate amon...

Data-dependent compression of random features for large-scale kernel approximation

Kernel methods offer the flexibility to learn complex relationships in m...

Bayesian Inference using the Proximal Mapping: Uncertainty Quantification under Varying Dimensionality

In statistical applications, it is common to encounter parameters suppor...

Bayesian Cox Regression for Population-scale Inference in Electronic Health Records

The Cox model is an indispensable tool for time-to-event analysis, parti...

Deep Learning Aided Laplace Based Bayesian Inference for Epidemiological Systems

Parameter estimation and associated uncertainty quantification is an imp...
