Improved Generalized Raking Estimators to Address Dependent Covariate and Failure-Time Outcome Error

by Eric J. Oh, et al.

Biomedical studies that use electronic health records (EHR) data for inference are often subject to bias due to measurement error. The measurement error present in EHR data is typically complex, consisting of errors of unknown functional form in covariates and the outcome, which can be dependent. To address the bias resulting from such errors, generalized raking has recently been proposed as a robust method that yields consistent estimates without the need to model the error structure. We provide rationale for why these previously proposed raking estimators can be expected to be inefficient in failure-time outcome settings involving misclassification of the event indicator. We propose raking estimators that utilize multiple imputation, to impute either the target variables or auxiliary variables, to improve the efficiency. We also consider outcome-dependent sampling designs and investigate their impact on the efficiency of the raking estimators, either with or without multiple imputation. We present an extensive numerical study to examine the performance of the proposed estimators across various measurement error settings. We then apply the proposed methods to our motivating setting, in which we seek to analyze HIV outcomes in an observational cohort with electronic health records data from the Vanderbilt Comprehensive Care Clinic.




Raking and Regression Calibration: Methods to Address Bias from Correlated Covariate and Time-to-Event Error

Medical studies that depend on electronic health records (EHR) data are ...

Three-phase generalized raking and multiple imputation estimators to address error-prone data

Validation studies are often used to obtain more reliable information in...

An Approximate Quasi-Likelihood Approach for Error-Prone Failure Time Outcomes and Exposures

Measurement error arises commonly in clinical research settings that rel...

Modeling complex measurement error in microbiome experiments

The relative abundances of species in a microbiome is a scientifically i...

Selective recruitment designs for improving observational studies using electronic health records

Large scale electronic health records (EHRs) present an opportunity to q...

Outcome identification in electronic health records using predictions from an enriched Dirichlet process mixture

We propose a novel semiparametric model for the joint distribution of a ...

Optimal Multi-Wave Validation of Secondary Use Data with Outcome and Exposure Misclassification

The growing availability of observational databases like electronic heal...
