Weak consistency of the 1-nearest neighbor measure with applications to missing data and covariate shift
When data is partially missing at random, imputation and importance weighting are often used to estimate moments of the unobserved population. In this paper, we study 1-nearest neighbor (1NN) imputation, which replaces the missing values of each incomplete observation with those of the complete observation that is its nearest neighbor in the space of non-missing covariates. We define an empirical measure, the 1NN measure, and show that it is weakly consistent for the measure of the missing data. The main idea behind this result is that 1NN imputation performs inverse probability weighting in the limit. We study applications to missing data and to assessing the impact of covariate shift in prediction tasks. We conclude with a discussion of using 1NN imputation for domain adaptation to alleviate the impact of covariate shift.
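As a minimal sketch of the procedure described above, and not the paper's own implementation, the following Python code imputes a missing response by copying it from the nearest complete observation and then computes the weight that each complete observation receives. The function names `one_nn_impute` and `one_nn_weights`, the use of Euclidean distance, and the toy data are assumptions made for illustration. The weights are simply the fraction of incomplete observations that select each complete observation as their donor, which is one way to read the 1NN measure as a reweighting of the complete data.

```python
import numpy as np


def one_nn_impute(x_complete, y_complete, x_missing):
    """Impute responses for incomplete rows from their nearest complete row.

    Assumes Euclidean distance on the observed covariates (an illustrative
    choice). Returns the imputed responses and the donor index for each
    incomplete row.
    """
    x_complete = np.asarray(x_complete, dtype=float)
    x_missing = np.asarray(x_missing, dtype=float)
    y_complete = np.asarray(y_complete, dtype=float)
    # Pairwise squared distances, shape (n_missing, n_complete).
    diffs = x_missing[:, None, :] - x_complete[None, :, :]
    sq_dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    donors = sq_dists.argmin(axis=1)
    return y_complete[donors], donors


def one_nn_weights(donors, n_complete):
    """Fraction of incomplete rows that picked each complete row as donor."""
    counts = np.bincount(donors, minlength=n_complete)
    return counts / len(donors)


# Hypothetical toy data: one covariate, response observed for three rows only.
x_complete = [[0.0], [1.0], [5.0]]
y_complete = [10.0, 20.0, 30.0]
x_missing = [[0.2], [0.9], [1.2], [4.0]]

y_imputed, donors = one_nn_impute(x_complete, y_complete, x_missing)
weights = one_nn_weights(donors, len(x_complete))

# Two equivalent estimates of the mean response in the missing population:
mean_from_imputation = y_imputed.mean()
mean_from_weights = float(np.dot(weights, y_complete))
```

The two estimates agree by construction: averaging the imputed responses is the same as taking a weighted average of the complete responses, with weights given by the donor counts. This is the finite-sample counterpart of the importance weighting interpretation mentioned in the abstract.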