Semi-supervised Conditional Density Estimation for Imputation and Classification of Incomplete Instances

06/03/2021
by Buliao Huang, et al.

Incomplete instances, in which various attributes are missing, arise in many real-world settings and pose challenges for classification. Missing-value imputation methods fill the missing entries with substitute values before classification. However, separating imputation from classification may lead to inferior performance, since label information is ignored during imputation. Moreover, these imputation methods tend to initialize the missing values under strong prior assumptions, while the unreliability of such initialization is rarely considered. To tackle these problems, this paper proposes a novel semi-supervised conditional normalizing flow (SSCFlow). SSCFlow explicitly uses the observed labels to facilitate imputation and classification simultaneously, employing a semi-supervised algorithm to estimate the conditional probability density of the missing values. In addition, SSCFlow treats the initialized missing values as a corrupted initial imputation and iteratively reconstructs their latent representations with an overcomplete denoising autoencoder to approximate the true conditional probability density of the missing values. Experiments on real-world datasets demonstrate the robustness and efficiency of the proposed algorithm.
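The abstract's point that ignoring labels during imputation can hurt may be easier to see with a toy example. The sketch below is not SSCFlow and involves no normalizing flow or autoencoder; it swaps in a much simpler technique, class-conditional mean imputation, purely for illustration. The function name `class_conditional_mean_impute` and the convention that `-1` marks an unlabeled instance are assumptions introduced here, not taken from the paper.

```python
import numpy as np

def class_conditional_mean_impute(X, y, unlabeled=-1):
    """Fill NaNs with the column mean of the instance's own class.

    Hypothetical helper for illustration only. Rows whose label equals
    `unlabeled` (an assumed convention) fall back to the global column mean.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    filled = X.copy()
    global_mean = np.nanmean(X, axis=0)  # label-agnostic fallback
    for label in np.unique(y):
        rows = y == label
        if label == unlabeled:
            col_fill = global_mean
        else:
            class_obs = X[rows]
            observed = ~np.isnan(class_obs)
            counts = observed.sum(axis=0)
            sums = np.where(observed, class_obs, 0.0).sum(axis=0)
            # Use the class mean where the class has observed values,
            # otherwise fall back to the global mean for that column.
            col_fill = np.where(counts > 0, sums / np.maximum(counts, 1), global_mean)
        block = filled[rows]
        missing = np.isnan(block)
        block[missing] = np.broadcast_to(col_fill, block.shape)[missing]
        filled[rows] = block
    return filled

# Two classes with clearly different feature scales.
X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [np.nan, 10.0],
              [5.0, 20.0]])
y = np.array([0, 0, 1, 1])
imputed = class_conditional_mean_impute(X, y)
print(imputed)
```

In this toy data a label-agnostic column mean would fill the missing entry in the first row with roughly 11.33, whereas the class-conditional fill uses 4.0, the only value observed for that attribute in class 0. SSCFlow, as described above, goes much further by estimating a full conditional density rather than a single mean.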

research
10/20/2020

RDIS: Random Drop Imputation with Self-Training for Incomplete Time Series Data

It is common that time-series data with missing values are encountered i...
research
03/28/2018

Semi-supervised learning for structured regression on partially observed attributed graphs

Conditional probabilistic graphical models provide a powerful framework ...
research
02/20/2023

PriSTI: A Conditional Diffusion Framework for Spatiotemporal Imputation

Spatiotemporal data mining plays an important role in air quality monito...
research
07/12/2021

Choosing Imputation Models

Imputing missing values is an important preprocessing step in data analy...
research
06/15/2022

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. One the on...
research
12/03/2020

Competition analysis on the over-the-counter credit default swap market

We study two questions related to competition on the OTC CDS market usin...
research
01/02/2023

Chains of Autoreplicative Random Forests for missing value imputation in high-dimensional datasets

Missing values are a common problem in data science and machine learning...