A Methodological Framework for the Comparative Evaluation of Multiple Imputation Methods: Multiple Imputation of Race, Ethnicity and Body Mass Index in the U.S. National COVID

06/13/2022
by Elena Casiraghi, et al.

Electronic health records (EHRs) are a rich data source for biomedical research, but these systems are not implemented uniformly across healthcare settings, and substantial data may be missing because of healthcare fragmentation and a lack of interoperability between siloed EHR systems. Because deleting cases with missing data may introduce severe bias into subsequent analyses, several authors prefer to apply a multiple imputation (MI) strategy to recover the missing information. Although several published studies have documented promising results with the MI algorithms that are now freely available for research, there is no consensus on which MI algorithm works best. Besides the choice of the MI strategy, the choice of the imputation algorithm and its application settings is both crucial and challenging. In this paper, inspired by the seminal works of Rubin and van Buuren, we propose a methodological framework for evaluating and comparing several MI techniques, with the aim of choosing the most valid one for computing inferences in a clinical research study. We applied the framework to validate, and extend to a larger cohort, the results of a previous study in which we evaluated the influence of crucial patient descriptors and COVID-19 severity in patients with type 2 diabetes mellitus, using data provided by the National COVID Cohort Collaborative Enclave.
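As background for readers unfamiliar with multiple imputation, the sketch below shows the standard pooling step usually attributed to Rubin: an analysis is run on each of the m imputed datasets, and the m estimates and their variances are combined into a single estimate and a total variance. This is a minimal illustration of that general rule, not the framework proposed in the paper; the function name `pool_rubin` and the numbers in the usage lines are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)   # pooled point estimate
    u_bar = mean(variances)   # within-imputation variance
    b = variance(estimates)   # between-imputation variance (divides by m - 1)
    t = u_bar + (1 + 1 / m) * b  # total variance
    return q_bar, t


# Hypothetical values from three imputed datasets, for illustration only.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The between-imputation term grows when the imputed datasets disagree with one another, so the pooled variance reflects the uncertainty caused by the missing values as well as ordinary sampling variability.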

research
05/31/2019

Bayesian Profiling Multiple Imputation for Missing Electronic Health Records

Electronic health records (EHRs) are increasingly used for clinical and ...
research
12/02/2018

Imputation of Clinical Covariates in Time Series

Missing data is a common problem in real-world settings and particularly...
research
04/14/2020

A logic-based resampling with matching approach to multiple imputation of missing data

Researchers often use model-based multiple imputation to handle missing ...
research
06/25/2020

ELMV: an Ensemble-Learning Approach for Analyzing Electrical Health Records with Significant Missing Values

Many real-world Electronic Health Record (EHR) data contains a large pro...
research
10/19/2021

Multilevel Stochastic Optimization for Imputation in Massive Medical Data Records

Exploration and analysis of massive datasets has recently generated incr...
research
10/22/2021

Missing the Point: Non-Convergence in Iterative Imputation Algorithms

Iterative imputation is a popular tool to accommodate missing data. Whil...
research
05/03/2022

Three-phase generalized raking and multiple imputation estimators to address error-prone data

Validation studies are often used to obtain more reliable information in...
