Testing for an ignorable sampling bias under random double truncation
In clinical and epidemiological research doubly truncated data often appear. This is the case, for instance, when the data registry is formed by interval sampling. Double truncation generally induces a sampling bias on the target variable, so proper corrections of ordinary estimation and inference procedures must be used. Unfortunately, the nonparametric maximum likelihood estimator of a doubly truncated distribution has several drawbacks, like potential non-existence and non-uniqueness issues, or large estimation variance. Interestingly, no correction for double truncation is needed when the sampling bias is ignorable, which may occur with interval sampling and other sampling designs. In such a case the ordinary empirical distribution function is a consistent and fully efficient estimator that generally brings remarkable variance improvements compared to the nonparametric maximum likelihood estimator. Thus, identification of such situations is critical for the simple and efficient estimation of the target distribution. In this paper we introduce for the first time formal testing procedures for the null hypothesis of ignorable sampling bias with doubly truncated data. The asymptotic properties of the proposed test statistic are investigated. A bootstrap algorithm to approximate the null distribution of the test in practice is introduced. The finite sample performance of the method is studied in simulated scenarios. Finally, applications to data on onset for childhood cancer and Parkinson's disease are given. Variance improvements in estimation are discussed and illustrated.
READ FULL TEXT