Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples

05/10/2021
by   D. Barreiro-Ures, et al.

Cross-validation is a well-known and widely used bandwidth selection method in nonparametric regression estimation. However, this technique has two notable drawbacks: (i) the large variability of the selected bandwidths, and (ii) its inability to provide results in a reasonable time for very large sample sizes. To overcome these problems, bagged cross-validation bandwidths are analyzed in this paper. The approach consists of computing the cross-validation bandwidth for a finite number of subsamples and then rescaling the averaged smoothing parameter to the original sample size. Under a random-design regression model, second-order asymptotic expressions for the bias and variance of the leave-one-out cross-validation bandwidth for the Nadaraya–Watson estimator are obtained. The asymptotic bias, variance and limit distribution of the bagged cross-validation selector are then derived. Suitable choices of the number of subsamples and the subsample size lead to an n^{-1/2} rate of convergence in distribution for the bagged cross-validation selector, improving on the n^{-3/10} rate of leave-one-out cross-validation. Several simulations and an illustration on a real dataset related to the COVID-19 pandemic show the behavior of our proposal and its better performance, in terms of both statistical efficiency and computing time, compared to leave-one-out cross-validation.
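To make the procedure concrete, the following is a minimal sketch (not the authors' implementation) of bagged cross-validation bandwidth selection for the Nadaraya–Watson estimator. It assumes a Gaussian kernel, a simple grid search for the leave-one-out criterion, and the usual rescaling factor (r/n)^{1/5}, which follows from the n^{-1/5} order of the optimal bandwidth; the function names, the grid, and the choice of subsample size r and number of subsamples N are illustrative.

```python
import numpy as np

def nw_estimate(x0, x, y, h):
    """Nadaraya-Watson estimate at points x0, Gaussian kernel, bandwidth h."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def loo_cv_bandwidth(x, y, grid):
    """Leave-one-out CV bandwidth for the Nadaraya-Watson estimator (grid search)."""
    best_h, best_score = grid[0], np.inf
    for h in grid:
        w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
        np.fill_diagonal(w, 0.0)              # leave-one-out: exclude the i-th point
        m_loo = (w @ y) / w.sum(axis=1)       # leave-one-out fits at the sample points
        score = np.mean((y - m_loo) ** 2)     # CV criterion
        if score < best_score:
            best_h, best_score = h, score
    return best_h

def bagged_cv_bandwidth(x, y, r, N, grid, seed=None):
    """Average the LOO-CV bandwidths of N subsamples of size r and rescale to size n.
    The (r/n)**(1/5) factor assumes the standard n^(-1/5) order of the optimal bandwidth."""
    rng = np.random.default_rng(seed)
    n = len(x)
    hs = []
    for _ in range(N):
        idx = rng.choice(n, size=r, replace=False)   # subsample without replacement
        hs.append(loo_cv_bandwidth(x[idx], y[idx], grid))
    return np.mean(hs) * (r / n) ** (1 / 5)          # rescale to the full sample size

# Illustrative usage on simulated data:
# rng = np.random.default_rng(0)
# x = rng.uniform(0, 1, 100_000)
# y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
# grid = np.geomspace(0.005, 0.5, 30)
# h_bag = bagged_cv_bandwidth(x, y, r=2_000, N=25, grid=grid, seed=1)
```

Because each leave-one-out criterion is evaluated only on subsamples of size r, the cost is roughly N·r^2 kernel evaluations per grid point instead of n^2, which is what makes the bagged selector feasible for very large samples; averaging over the N subsamples is also what reduces the variability of the selected bandwidth.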
