Loss function based second-order Jensen inequality and its application to particle variational inference

by Futoshi Futami, et al.

Bayesian model averaging, obtained as the expectation of a likelihood function under a posterior distribution, has been widely used for prediction, uncertainty evaluation, and model selection. Various approaches have been developed to efficiently capture the information in the posterior distribution; one such approach optimizes a set of models simultaneously, with interaction terms that ensure the diversity of the individual models, much as in ensemble learning. A representative approach is particle variational inference (PVI), which uses an ensemble of models as an empirical approximation of the posterior distribution. PVI iteratively updates each model with a repulsion force that keeps the optimized models diverse. However, despite its promising performance, a theoretical understanding of this repulsion and its connection to generalization ability remains unclear. In this paper, we tackle this problem through PAC-Bayesian analysis. First, we provide a new second-order Jensen inequality that contains a repulsion term based on the loss function; thanks to this term, it is tighter than the standard Jensen inequality. We then derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of the models. Finally, we derive a new PVI that optimizes the generalization error bound directly. Numerical experiments demonstrate that the performance of the proposed PVI compares favorably with existing methods.
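As background for the attraction-repulsion structure the abstract describes, the sketch below shows a minimal Stein variational gradient descent (SVGD) style particle update, a standard instance of PVI. This is illustrative only, not the bound-optimizing update proposed in the paper; the RBF kernel, bandwidth, and function names are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(particles, bandwidth=1.0):
    """Pairwise RBF kernel matrix and its gradient w.r.t. the first argument.

    Illustrative choice: PVI methods differ in kernel and bandwidth selection.
    """
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d), x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)                  # (n, n)
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    # grad_K[j, i] = gradient of k(x_j, x_i) with respect to x_j
    grad_K = -diffs * K[:, :, None] / bandwidth ** 2        # (n, n, d)
    return K, grad_K

def svgd_step(particles, grad_log_p, step_size=0.05, bandwidth=1.0):
    """One SVGD update for an ensemble of particles (models).

    The kernel-weighted gradient term drives particles toward high posterior
    density; the kernel-gradient term is the repulsion force that keeps the
    ensemble diverse, preventing all particles from collapsing onto one mode.
    """
    n = particles.shape[0]
    K, grad_K = rbf_kernel(particles, bandwidth)
    drive = K @ grad_log_p(particles)   # attraction toward the posterior
    repulse = grad_K.sum(axis=0)        # repulsion between particles
    return particles + step_size * (drive + repulse) / n

# Usage: approximate a standard normal posterior from a collapsed ensemble.
rng = np.random.default_rng(0)
particles = rng.normal(0.0, 0.1, size=(20, 1))   # nearly identical models
for _ in range(500):
    particles = svgd_step(particles, lambda p: -p)  # grad log N(0, 1) = -x
```

After the loop the repulsion term has spread the initially collapsed particles out to cover the posterior, which is exactly the diversity effect the paper's generalization bound rewards.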


