Missing Value Imputation Based on Deep Generative Models

08/05/2018
by Hongbao Zhang, et al.

Missing values are widespread in real-world datasets and hinder advanced data analytics. Properly filling in these missing values is crucial but challenging, especially when the missing rate is high. Many approaches have been proposed for missing value imputation (MVI), but most are heuristic, lack a principled foundation, and do not perform satisfactorily in practice. In this paper, we propose a probabilistic framework for MVI based on deep generative models. Under this framework, imputing the missing entries amounts to seeking a fixed-point solution between two conditional distributions, one defined on the missing entries and the other on the latent variables. These distributions are parameterized by deep neural networks (DNNs), which possess high approximation power and can capture the nonlinear relationships between the missing entries and the observed values. The weight parameters of the DNNs are learned by maximizing an approximation of the log-likelihood of the observed values. We conducted an extensive evaluation on 13 datasets and compared against 11 baseline methods; our methods largely outperform the baselines.
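
The fixed-point idea in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the two conditional distributions are replaced by the deterministic means of a toy one-dimensional linear model, and the weights `W`, `B` and the functions `encode`, `decode`, and `impute` are hypothetical names introduced only for illustration. The loop alternates between inferring the latent variable from the current row and refilling the missing entries from that latent variable until the filled values stop changing.

```python
from typing import List, Optional

# Hypothetical stand-in for trained DNNs: a toy linear model x_j = W[j] * z + B[j].
# These weights are assumptions for illustration, not values from the paper.
W = [1.0, 2.0, -1.0, 0.5]
B = [0.0, 1.0, 3.0, -2.0]


def encode(x: List[float]) -> float:
    """Least-squares latent value for a fully filled row (stand-in for the mean of p(z | x))."""
    num = sum(w * (xj - b) for w, b, xj in zip(W, B, x))
    den = sum(w * w for w in W)
    return num / den


def decode(z: float) -> List[float]:
    """Reconstruct every entry from the latent value (stand-in for the mean of p(x | z))."""
    return [w * z + b for w, b in zip(W, B)]


def impute(row: List[Optional[float]], tol: float = 1e-10, max_iter: int = 1000) -> List[float]:
    """Alternate between the latent value and the missing entries until a fixed point."""
    missing = [j for j, v in enumerate(row) if v is None]
    # Initialize missing entries with the decoder offset; observed entries never change.
    x = [B[j] if v is None else v for j, v in enumerate(row)]
    for _ in range(max_iter):
        z = encode(x)          # step 1: latent variable given the current fill
        recon = decode(z)      # step 2: entries given the latent variable
        delta = max((abs(recon[j] - x[j]) for j in missing), default=0.0)
        for j in missing:
            x[j] = recon[j]    # overwrite only the missing positions
        if delta < tol:
            break
    return x


if __name__ == "__main__":
    # Row generated with z = 2, so the hidden entries are 5.0 and -1.0.
    print(impute([2.0, None, 1.0, None]))
```

In this toy setting the iteration is a contraction as long as at least one entry with a nonzero weight is observed, so it settles on the latent value that best explains the observed entries. With neural networks and sampling in place of these linear means, convergence would have to be checked empirically.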

research
02/27/2019

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common on industry applications, a...
research
08/16/2023

Deep Generative Imputation Model for Missing Not At Random Data

Data analysis usually suffers from the Missing Not At Random (MNAR) prob...
research
08/13/2023

Probabilistic Imputation for Time-series Classification with Missing Data

Multivariate time series data for real-world applications typically cont...
research
04/03/2021

Training Deep Normalizing Flow Models in Highly Incomplete Data Scenarios with Prior Regularization

Deep generative frameworks including GANs and normalizing flow models ha...
research
02/02/2023

Conditional expectation for missing data imputation

Missing data is common in datasets retrieved in various areas, such as m...
research
10/13/2022

Probabilistic Missing Value Imputation for Mixed Categorical and Ordered Data

Many real-world datasets contain missing entries and mixed data types in...
research
08/29/2022

A Missing Value Filling Model Based on Feature Fusion Enhanced Autoencoder

With the advent of the big data era, the data quality problem is becomin...
