On the Current State of Research in Explaining Ensemble Performance Using Margins

06/07/2019
by Waldyn Martinez, et al.

Empirical evidence shows that ensembles such as bagging, boosting, random forests, and rotation forests generally achieve lower generalization error than individual classifiers. To explain this performance, Schapire et al. (1998) developed an upper bound on the generalization error of an ensemble based on the margins of the training data, from which they concluded that larger margins should lead to lower generalization error, everything else being equal. Many other researchers have supported this conclusion and presented tighter bounds on the generalization error based on either the margins or functions of the margins. For instance, Shen and Li (2010) provide evidence suggesting that the generalization error of a voting classifier might be reduced by increasing the mean and decreasing the variance of the margins. In this article we propose several techniques and empirically test whether these margin-based explanations of ensemble performance hold. We evaluate the proposed methods through experiments with real and simulated data sets.
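
To make the margin quantities above concrete, the sketch below computes the voting margins of a bagged ensemble on its training data, along with their mean and variance. It is a minimal illustration only, assuming scikit-learn's BaggingClassifier and a synthetic data set; it is not the experimental setup used in the article. The margin of a training example is taken to be the fraction of base classifiers voting for its true class minus the largest fraction voting for any other class, so it lies in [-1, 1] and is positive exactly when the ensemble's plurality vote is correct.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier

    # Synthetic binary classification data (a stand-in for the real and
    # simulated data sets referred to in the abstract).
    X, y = make_classification(n_samples=300, n_features=10, random_state=0)

    # Bagged ensemble of decision trees (scikit-learn's default base estimator).
    ens = BaggingClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Count, for every training point, how many base classifiers vote for each class.
    votes = np.zeros((X.shape[0], len(ens.classes_)))
    for est, feats in zip(ens.estimators_, ens.estimators_features_):
        pred = est.predict(X[:, feats])
        for k, c in enumerate(ens.classes_):
            votes[:, k] += (pred == c)
    vote_frac = votes / len(ens.estimators_)

    # Margin of example i: vote fraction for the true class minus the largest
    # vote fraction for any other class.
    rows = np.arange(len(y))
    true_idx = np.searchsorted(ens.classes_, y)
    true_frac = vote_frac[rows, true_idx]
    others = vote_frac.copy()
    others[rows, true_idx] = -np.inf
    margins = true_frac - others.max(axis=1)

    print("mean margin:         ", margins.mean())
    print("margin variance:     ", margins.var())
    # Non-positive margins are training points the plurality vote misses or ties on.
    print("non-positive margins:", (margins <= 0).mean())

Under the large-margins view examined in the article, an ensemble whose margin distribution has a larger mean and a smaller variance would be expected to generalize better, everything else being equal.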
