Conditional Generation of Medical Time Series for Extrapolation to Underrepresented Populations

01/20/2022
by Simon Bing, et al.

The widespread adoption of electronic health records (EHRs) and the subsequent increase in the availability of longitudinal healthcare data have led to significant advances in our understanding of health and disease, with direct and immediate impact on the development of new diagnostics and therapeutic treatment options. However, access to EHRs is often restricted due to their perceived sensitive nature and associated legal concerns, and the cohorts they contain are typically those seen at a specific hospital or network of hospitals and are therefore not representative of the wider patient population. Here, we present HealthGen, a new approach for the conditional generation of synthetic EHRs that maintains an accurate representation of real patient characteristics, temporal information and missingness patterns. We demonstrate experimentally that HealthGen generates synthetic cohorts that are significantly more faithful to real patient EHRs than the current state of the art, and that augmenting real data sets with conditionally generated cohorts of underrepresented patient subpopulations can significantly enhance the generalisability of models derived from these data sets to different patient populations. Synthetic, conditionally generated EHRs could help increase the accessibility of longitudinal healthcare data sets and improve the generalisability of inferences made from these data sets to underrepresented populations.


