Medical data wrangling with sequential variational autoencoders

03/12/2021
by Daniel Barrejón, et al.

Medical data sets are usually corrupted by noise and missing data. Missingness patterns are commonly assumed to be completely random, but in medical settings they tend to occur in bursts, for example because sensors are switched off for some time or because data are collected in a misaligned, uneven fashion. This paper proposes to model medical data records with heterogeneous data types and bursty missing data using sequential variational autoencoders (VAEs). In particular, we propose a new methodology, the Shi-VAE, which extends the capabilities of VAEs to sequential streams of data with missing observations. We compare our model against state-of-the-art solutions on an intensive care unit (ICU) database and a dataset of passive human monitoring. Furthermore, we find that standard error metrics such as RMSE are not conclusive enough to assess temporal models, so our analysis also includes the cross-correlation between the ground truth and the imputed signal. We show that Shi-VAE achieves the best performance on both metrics, with lower computational complexity than the GP-VAE model, which is the state-of-the-art method for medical records.
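As a rough illustration of the two evaluation ideas mentioned in the abstract, the sketch below builds a bursty missingness mask and scores an imputation with both RMSE and a correlation measure, each computed on the missing samples only. Everything here is an assumption made for illustration: the function names (`bursty_mask`, `rmse`, `zero_lag_correlation`), the toy sine signal, and in particular the use of a zero-lag normalized cross-correlation, since the abstract does not state the exact definition the authors use. It is not the Shi-VAE model or the paper's evaluation code.

```python
import numpy as np

def bursty_mask(length, n_bursts, burst_len, rng):
    """Boolean mask where True marks a missing sample (hypothetical helper).

    Missing entries come in contiguous bursts rather than independently
    at random; bursts may overlap.
    """
    mask = np.zeros(length, dtype=bool)
    starts = rng.integers(0, length - burst_len + 1, size=n_bursts)
    for s in starts:
        mask[s:s + burst_len] = True
    return mask

def rmse(truth, imputed, missing):
    """Root mean squared error restricted to the missing positions."""
    diff = truth[missing] - imputed[missing]
    return float(np.sqrt(np.mean(diff ** 2)))

def zero_lag_correlation(truth, imputed, missing):
    """Zero-lag normalized cross-correlation on the missing positions.

    Assumption: this stands in for the paper's cross-correlation metric.
    Returns 0.0 when either signal is (numerically) constant.
    """
    a = truth[missing] - truth[missing].mean()
    b = imputed[missing] - imputed[missing].mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return float(np.sum(a * b) / denom) if denom > 1e-12 else 0.0

# Toy signal with three bursts of 20 consecutive missing samples.
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
truth = np.sin(t)
missing = bursty_mask(len(truth), n_bursts=3, burst_len=20, rng=rng)

# Two candidate imputations: a constant mean fill and a noisy copy.
mean_fill = np.full_like(truth, truth[~missing].mean())
noisy_fill = truth + rng.normal(0.0, 0.3, size=truth.shape)

for name, imputed in [("mean fill", mean_fill), ("noisy fill", noisy_fill)]:
    print(name,
          "RMSE:", rmse(truth, imputed, missing),
          "corr:", zero_lag_correlation(truth, imputed, missing))
```

A constant fill carries no temporal structure, so its correlation with the ground truth is zero by construction even when its RMSE looks acceptable. That is one way to see why an error metric alone may not be conclusive for temporal models.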

research
06/09/2020

VAEs in the Presence of Missing Data

Real world datasets often contain entries with missing elements e.g. in ...
research
01/18/2021

Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders

Electronic Health Records (EHRs) are commonly used to investigate relati...
research
10/31/2016

Temporal Matrix Completion with Locally Linear Latent Factors for Medical Applications

Regular medical records are useful for medical practitioners to analyze ...
research
07/10/2018

Handling Incomplete Heterogeneous Data using VAEs

Variational autoencoders (VAEs), as well as other generative models, hav...
research
05/23/2019

Robust Variational Autoencoder

Machine learning methods often need a large amount of labeled training d...
research
02/24/2023

Imputing Knowledge Tracing Data with Subject-Based Training via LSTM Variational Autoencoders Frameworks

The issue of missing data poses a great challenge on boosting performanc...
research
04/26/2020

Notes on Icebreaker

Icebreaker [1] is new research from MSR that is able to achieve state of...
