Unsupervised Data Imputation via Variational Inference of Deep Subspaces

by   Adrian V. Dalca, et al.

A wide range of systems exhibit high dimensional incomplete data. Accurate estimation of the missing data is often desired, and is crucial for many downstream analyses. Many state-of-the-art recovery methods involve supervised learning using datasets containing full observations. In contrast, we focus on unsupervised estimation of missing image data, where no full observations are available - a common situation in practice. Unsupervised imputation methods for images often employ a simple linear subspace to capture correlations between data dimensions, omitting more complex relationships. In this work, we introduce a general probabilistic model that describes sparse high dimensional imaging data as being generated by a deep non-linear embedding. We derive a learning algorithm using a variational approximation based on convolutional neural networks and discuss its relationship to linear imputation models, the variational auto encoder, and deep image priors. We introduce sparsity-aware network building blocks that explicitly model observed and missing data. We analyze proposed sparsity-aware network building blocks, evaluate our method on public domain imaging datasets, and conclude by showing that our method enables imputation in an important real-world problem involving medical images. The code is freely available as part of the |neuron| library at http://github.com/adalca/neuron.


page 1

page 4

page 7

page 8

page 9


Leveraging variational autoencoders for multiple data imputation

Missing data persists as a major barrier to data analysis across numerou...

Deep Generative Pattern-Set Mixture Models for Nonignorable Missingness

We propose a variational autoencoder architecture to model both ignorabl...

A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning

This paper takes a step towards temporal reasoning in a dynamically chan...

Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data

We propose a new probabilistic method for unsupervised recovery of corru...

Variational Inference for Longitudinal Data Using Normalizing Flows

This paper introduces a new latent variable generative model able to han...

Fed-MIWAE: Federated Imputation of Incomplete Data via Deep Generative Models

Federated learning allows for the training of machine learning models on...

Using Data Imputation for Signal Separation in High Contrast Imaging

To characterize circumstellar systems in high contrast imaging, the fund...

Please sign up or login with your details

Forgot password? Click here to reset