User-friendly introduction to PAC-Bayes bounds

by Pierre Alquier, et al.

Aggregated predictors are obtained by making a set of basic predictors vote according to some weights, that is, according to some probability distribution. Randomized predictors are obtained by sampling from a set of basic predictors according to some prescribed probability distribution. Thus, aggregated and randomized predictors have in common that they are defined not by a minimization problem but by a probability distribution on the set of predictors. In statistical learning theory, there is a set of tools designed to understand the generalization ability of such procedures: PAC-Bayesian, or PAC-Bayes, bounds. Since the original PAC-Bayes bounds of D. McAllester, these tools have been considerably improved in many directions (we will, for example, describe a simplified version of the localization technique of O. Catoni that was missed by the community and later rediscovered as "mutual information bounds"). Very recently, PAC-Bayes bounds have received considerable attention: for example, there was a workshop on PAC-Bayes at NIPS 2017, "(Almost) 50 Shades of Bayesian Learning: PAC-Bayesian trends and insights", organized by B. Guedj, F. Bach and P. Germain. One of the reasons for this recent success is the application of these bounds to neural networks by G. Dziugaite and D. Roy. An elementary introduction to PAC-Bayes theory is still missing. This is an attempt to provide such an introduction.
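To make the distinction between aggregated and randomized predictors concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the threshold classifiers, the weights, and the function names are hypothetical and do not come from the paper.

```python
import random

# Hypothetical basic predictors: threshold classifiers on a real-valued input.
def make_threshold_predictor(t):
    return lambda x: 1 if x >= t else -1

predictors = [make_threshold_predictor(t) for t in (0.0, 1.0, 2.0)]

# Assumed weights: a probability distribution over the basic predictors.
weights = [0.6, 0.3, 0.1]

def aggregated_predict(x):
    """Aggregated predictor: the basic predictors vote according to the weights."""
    score = sum(w * f(x) for w, f in zip(weights, predictors))
    return 1 if score >= 0 else -1

def randomized_predict(x, rng):
    """Randomized predictor: sample one basic predictor according to the weights."""
    f = rng.choices(predictors, weights=weights, k=1)[0]
    return f(x)

if __name__ == "__main__":
    rng = random.Random(0)
    print(aggregated_predict(0.5))                           # deterministic vote
    print([randomized_predict(0.5, rng) for _ in range(5)])  # varies with each draw
```

In both cases the object that defines the procedure is the weight vector, a probability distribution on the set of predictors, rather than the solution of a minimization problem.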




