Aggregation using input-output trade-off

03/08/2018
by Aurélie Fischer, et al.

In this paper, we introduce a new learning strategy based on a seminal idea of Mojirsheibani (1999, 2000, 2002a, 2002b), who proposed a smart method for combining several classifiers, relying on a notion of consensus. In many aggregation methods, the prediction for a new observation x is computed as a linear or convex combination over a collection of basic estimators r1(x), …, rm(x) previously calibrated on a training data set. Mojirsheibani instead proposes to compute the prediction for a new observation by combining selected outputs of the training examples. The output of a training example is selected if some kind of consensus is observed: the predictions computed for that training example by the different machines have to be "similar" to the predictions for the new observation. This approach has recently been extended to the regression context by Biau et al. (2016). In the original scheme, the agreement condition is required to hold for all individual estimators, which is inadequate if even one initial estimator is bad. In practice, a few disagreements are allowed; to establish the theoretical results, the proportion of estimators satisfying the condition is required to tend to 1. In this paper, we propose an alternative procedure, mixing the previous consensus ideas on the predictions with the Euclidean distance computed between the inputs. This may be seen as an alternative approach that reduces the effect of a possibly bad estimator in the initial list by means of a constraint on the inputs. We prove the consistency of our strategy in classification and in regression. We also provide numerical experiments on simulated and real data to illustrate the benefits of this new aggregation method. On the whole, our practical study shows that our method may perform much better than the original combination technique and, in particular, exhibits far less variance.
We also show on simulated examples that this procedure mixing inputs and outputs remains robust to high-dimensional inputs.
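The combination rule described above can be illustrated with a short sketch: average the observed outputs of those training points whose machine predictions agree with the predictions for the new point (output consensus) and whose inputs lie within a Euclidean ball around the new point (input constraint). This is a minimal toy illustration under assumed conventions, not the authors' implementation; the thresholds `eps_out` and `eps_in`, the unanimity default, and the fallback rule are all illustrative choices.

```python
import numpy as np

def consensual_aggregate(x_new, X_train, y_train, machines,
                         eps_out=0.5, eps_in=1.0, min_agree=None):
    """Toy consensual aggregation for regression (illustrative sketch).

    A training point contributes to the prediction if, for at least
    `min_agree` of the basic machines, its predicted output is within
    `eps_out` of the prediction at x_new (output consensus), and its
    input is within Euclidean distance `eps_in` of x_new (input
    constraint).
    """
    # Predictions of each machine at the new point: shape (M,)
    preds_new = np.array([m(x_new) for m in machines])
    # Predictions of each machine at each training point: shape (n, M)
    preds_train = np.array([[m(x) for m in machines] for x in X_train])
    if min_agree is None:
        min_agree = len(machines)  # unanimity, as in the original scheme
    # Output consensus: enough machines agree within eps_out
    agree = (np.abs(preds_train - preds_new) <= eps_out).sum(axis=1) >= min_agree
    # Input constraint: Euclidean proximity of the entries
    close = np.linalg.norm(X_train - x_new, axis=1) <= eps_in
    mask = agree & close
    if not mask.any():
        # Fallback when no training point passes: average the machines
        return preds_new.mean()
    # Average the observed outputs of the selected training points
    return y_train[mask].mean()
```

Requiring unanimity (`min_agree = len(machines)`) reproduces the strictness of the original scheme; lowering `min_agree` allows the few disagreements tolerated in practice, and the input constraint is what limits the influence of a single bad estimator.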
