Learning to Rank with Missing Data via Generative Adversarial Networks

11/04/2020
by Grace Deng, et al.

We explore the role of Conditional Generative Adversarial Networks (GANs) in imputing missing data and apply GAN imputation to a novel use case in e-commerce: a learning-to-rank problem with incomplete training data. Conventional imputation methods often make assumptions about the underlying distribution of the missing data, whereas GANs offer an alternative framework that sidesteps approximating intractable distributions. First, we prove that GAN imputation offers theoretical guarantees beyond the naive Missing Completely At Random (MCAR) scenario. Next, we show empirically that the Conditional GAN structure is well suited to data with heterogeneous distributions and unbalanced classes, improving performance metrics such as RMSE. Using an Amazon Search ranking dataset, we train standard ranking models on GAN-imputed data and find them comparable to models trained on ground-truth data, as measured by the standard ranking quality metrics NDCG and MRR. We also highlight how different neural network features, such as convolution and dropout layers, can improve performance under different missing-value settings.
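For readers unfamiliar with the two ranking metrics named above, the following is a minimal sketch of how NDCG and MRR are commonly computed. It is not taken from the paper: the function names are hypothetical, and the exponential gain (2^rel - 1) with a log2 discount is one common NDCG convention assumed here, since the abstract does not state which variant the authors use.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items, given in ranked order.

    Assumes the exponential gain 2^rel - 1 and a log2(rank + 1) discount
    (rank is 1-based); other conventions exist.
    """
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def mrr(ranked_lists):
    """Mean reciprocal rank of the first relevant item (rel > 0) per query.

    Queries with no relevant item contribute 0 to the average.
    """
    if not ranked_lists:
        return 0.0
    total = 0.0
    for relevances in ranked_lists:
        for rank, rel in enumerate(relevances, start=1):
            if rel > 0:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

Each input list holds the relevance labels of one query's results in the order the model ranked them, so a ranker trained on imputed data and one trained on ground-truth data can be compared by scoring both on the same held-out queries.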

research
02/25/2019

MisGAN: Learning from Incomplete Data with Generative Adversarial Networks

Generative adversarial networks (GANs) have been shown to provide an eff...
research
12/23/2020

IFGAN: Missing Value Imputation using Feature-specific Generative Adversarial Networks

Missing value imputation is a challenging and well-researched topic in d...
research
08/11/2020

IGANI: Iterative Generative Adversarial Networks for Imputation Applied to Prediction of Traffic Data

Generative adversarial networks (GANs) are implicit generative models th...
research
08/03/2021

Categorical EHR Imputation with Generative Adversarial Nets

Electronic Health Records often suffer from missing data, which poses a ...
research
10/06/2022

Comparison of Missing Data Imputation Methods using the Framingham Heart study dataset

Cardiovascular disease (CVD) is a class of diseases that involve the hea...
research
08/22/2017

VIGAN: Missing View Imputation with Generative Adversarial Networks

In an era when big data are becoming the norm, there is less concern wit...
research
01/26/2022

Generative Trees: Adversarial and Copycat

While Generative Adversarial Networks (GANs) achieve spectacular results...
