Understanding the bias-variance tradeoff of Bregman divergences

02/08/2022
by Ben Adlam, et al.

This paper builds upon the work of Pfau (2013), which generalized the bias-variance tradeoff to any Bregman divergence loss function. Pfau (2013) showed that for Bregman divergences, the bias and variance are defined with respect to a central label, which is the mean of the label variable, and a central prediction, which has a more complex form. We show that, like the central label, the central prediction can be interpreted as the mean of a random variable, where the mean is taken in a dual space defined by the loss function itself. Viewing the bias-variance tradeoff through operations taken in this dual space, we derive several results of interest. In particular, (a) the variance terms satisfy a generalized law of total variance; (b) if a source of randomness cannot be controlled, its contribution to the bias and variance has a closed form; (c) there exist natural ensembling operations in the label and prediction spaces which reduce the variance and do not affect the bias.
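
To make the dual-space mean concrete, the following is a minimal numerical sketch, not taken from the paper. It assumes one particular Bregman divergence, the generalized KL divergence generated by F(x) = x log x - x on positive reals, whose gradient is log x, so that the dual-space mean is a geometric mean. The sample sizes, value ranges, and variable names are hypothetical, and labels and predictions are assumed independent.

```python
import numpy as np

def gen_kl(p, q):
    """Generalized KL divergence: the Bregman divergence of F(x) = x log x - x, x > 0."""
    return p * np.log(p / q) - p + q

rng = np.random.default_rng(0)
labels = rng.uniform(0.5, 2.0, size=50)  # hypothetical label samples
preds = rng.uniform(0.5, 2.0, size=40)   # hypothetical predictions, independent of labels

# Central label: ordinary mean in the primal space.
central_label = labels.mean()

# Central prediction: map to the dual space with grad F = log,
# average there, then map back with the inverse gradient exp.
central_pred = np.exp(np.log(preds).mean())

# Expected loss over all (label, prediction) pairs, i.e. under independence.
expected_loss = gen_kl(labels[:, None], preds[None, :]).mean()

noise = gen_kl(labels, central_label).mean()    # spread of labels around the central label
bias = gen_kl(central_label, central_pred)      # central label vs. central prediction
variance = gen_kl(central_pred, preds).mean()   # spread of predictions around the central prediction

# The three terms should add up to the expected loss, up to floating-point error.
decomposition_gap = abs(expected_loss - (noise + bias + variance))
```

For this choice of divergence the variance term reduces to the gap between the arithmetic and geometric means of the predictions, which is one way to see why it is non-negative. Other Bregman divergences would use a different gradient map in place of log and exp.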

12/17/2019

On the Bias-Variance Tradeoff: Textbooks Need an Update

The main goal of this thesis is to point out that the bias-variance trad...
01/05/2021

A unifying approach on bias and variance analysis for classification

Standard bias and variance (B&V) terminologies were originally defined...
06/21/2022

Ensembling over Classifiers: a Bias-Variance Perspective

Ensembles are a straightforward, remarkably effective method for improvi...
10/19/2018

A Modern Take on the Bias-Variance Tradeoff in Neural Networks

We revisit the bias-variance tradeoff for neural networks in light of mo...
02/23/2020

Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias and Variance

The study of model bias and variance with respect to decision boundaries...
11/04/2020

Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition

Classical learning theory suggests that the optimal generalization perfo...