Reference based multiple imputation – what is the right variance and how to estimate it
Reference based multiple imputation methods have become popular for handling missing data in randomised clinical trials. Rubin's variance estimator is well known to be biased relative to the reference based imputation estimator's true repeated sampling variance. Somewhat surprisingly, given the increasing popularity of these methods, there has been relatively little debate in the literature as to whether Rubin's variance estimator or alternative (smaller) variance estimators targeting the repeated sampling variance are more appropriate. We review the arguments made on both sides of this debate, and conclude that the repeated sampling variance is more appropriate. We review different approaches for estimating the frequentist variance, and suggest a recent proposal for combining bootstrapping with multiple imputation as a widely applicable general solution. At the same time, in light of the consequences of reference based assumptions for frequentist variance, we believe further scrutiny of these methods is warranted to determine whether the strength of their assumptions is generally justifiable.
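For readers unfamiliar with it, Rubin's variance estimator mentioned above combines the average within-imputation variance with the between-imputation variance of the point estimates. The following is a minimal sketch of that standard pooling rule only; the function name `rubin_pool` and its inputs are hypothetical and not taken from the paper, and it implements neither reference based imputation itself nor the bootstrap approach the abstract recommends.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool M completed-data results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: the corresponding completed-data variance estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)      # pooled point estimate
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates) # sample variance (divisor M - 1) across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total
```

Under reference based assumptions, `total` is the quantity the abstract describes as biased for the repeated sampling variance, which is why alternative estimators are considered.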