Prevalence Estimation from Random Samples and Census Data with Participation Bias
Countries officially record the number of COVID-19 cases based on medical tests of a subset of the population that is subject to unknown participation bias. For prevalence estimation, the official information is typically discarded and, instead, small random survey samples are taken. We derive prevalence estimators (maximum likelihood and method of moments) that are based on a survey sample but additionally utilize the official information, and that are substantially more accurate than the simple sample proportion of positive cases. Put differently, using our estimators, the same level of precision can be obtained with substantially smaller survey samples. We take into account the possibility of measurement errors due to the imperfect sensitivity and specificity of the medical testing procedure. The proposed estimators and associated confidence intervals are implemented in the companion open-source R package cape.
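For orientation, the baseline that the abstract compares against, the simple sample proportion of positive cases, can be adjusted for imperfect sensitivity and specificity with the standard Rogan-Gladen correction. The sketch below illustrates only that baseline; it is not the maximum likelihood or method-of-moments estimator proposed in the paper, it does not use the official census information, and it does not reproduce the interface of the cape package. The function names, argument names, and the Wald-type interval are assumptions made for illustration.

```python
import math


def corrected_prevalence(positives, n, sensitivity=1.0, specificity=1.0):
    """Rogan-Gladen correction of the raw sample proportion (hypothetical helper).

    positives: number of positive tests in the random survey sample
    n: survey sample size
    """
    if n <= 0:
        raise ValueError("n must be positive")
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    raw = positives / n
    # Remove the expected false-positive share, then rescale.
    estimate = (raw - (1.0 - specificity)) / youden
    # Truncate to the admissible range [0, 1].
    return min(1.0, max(0.0, estimate))


def wald_interval(positives, n, sensitivity=1.0, specificity=1.0, z=1.96):
    """Approximate Wald interval, treating sensitivity and specificity as known."""
    youden = sensitivity + specificity - 1.0
    raw = positives / n
    # The standard error of the raw proportion is inflated by 1 / youden.
    se = math.sqrt(raw * (1.0 - raw) / n) / youden
    est = corrected_prevalence(positives, n, sensitivity, specificity)
    return max(0.0, est - z * se), min(1.0, est + z * se)


if __name__ == "__main__":
    est = corrected_prevalence(50, 1000, sensitivity=0.9, specificity=0.98)
    low, high = wald_interval(50, 1000, sensitivity=0.9, specificity=0.98)
    print(f"estimate={est:.4f}, interval=({low:.4f}, {high:.4f})")
```

With a perfect test (sensitivity and specificity both equal to 1), the corrected estimate reduces to the raw sample proportion.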