Density Ratio Estimation and Neyman Pearson Classification with Missing Data

02/21/2023
by Josh Givens, et al.

Density Ratio Estimation (DRE) is an important machine learning technique with many downstream applications. We consider the challenge of DRE with missing not at random (MNAR) data. In this setting, we show that using standard DRE methods leads to biased results, while our proposal (M-KLIEP), an adaptation of the popular DRE procedure KLIEP, restores consistency. Moreover, we provide finite sample estimation error bounds for M-KLIEP, which demonstrate minimax optimality with respect to both sample size and worst-case missingness. We then adapt an important downstream application of DRE, Neyman-Pearson (NP) classification, to this MNAR setting. Our procedure both controls Type I error and achieves high power, with high probability. Finally, we demonstrate promising empirical performance on both synthetic data and real-world data with simulated missingness.
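
For readers unfamiliar with the baseline that M-KLIEP adapts, the following is a minimal sketch of standard KLIEP on fully observed data, not the M-KLIEP procedure itself, whose details are not given in this abstract. It models the ratio as a non-negative combination of Gaussian kernels and maximizes the mean log-ratio on numerator samples, subject to the ratio averaging to one on denominator samples. The function names (`gaussian_kernel`, `kliep_fit`), the simplified clip-and-renormalize update, and all default parameter values are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    """Gaussian kernel matrix between rows of x (n, d) and centers (b, d)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def kliep_fit(x_num, x_den, sigma=1.0, n_centers=50, lr=1e-2, n_iter=2000, seed=0):
    """Sketch of standard KLIEP: estimate r(x) = p_num(x) / p_den(x).

    Assumes complete data (no missingness). Returns a callable r(x).
    """
    rng = np.random.default_rng(seed)
    n_centers = min(n_centers, len(x_num))
    # Kernel centers are drawn from the numerator sample (a common choice).
    centers = x_num[rng.choice(len(x_num), size=n_centers, replace=False)]

    A = gaussian_kernel(x_num, centers, sigma)               # (n_num, b)
    b = gaussian_kernel(x_den, centers, sigma).mean(axis=0)  # (b,)

    # Start from uniform weights, scaled so the ratio averages to 1 on x_den.
    alpha = np.ones(n_centers)
    alpha /= b @ alpha

    for _ in range(n_iter):
        # Gradient of mean(log(A @ alpha)) with respect to alpha.
        grad = (A / (A @ alpha)[:, None]).mean(axis=0)
        alpha = alpha + lr * grad
        # Simplified feasibility step: clip to non-negative, then renormalize.
        alpha = np.maximum(alpha, 0.0)
        alpha /= b @ alpha

    return lambda x: gaussian_kernel(x, centers, sigma) @ alpha


# Usage on synthetic 1-D data: numerator N(1, 1), denominator N(0, 1).
rng = np.random.default_rng(1)
x_num = rng.normal(1.0, 1.0, size=(300, 1))
x_den = rng.normal(0.0, 1.0, size=(300, 1))
ratio = kliep_fit(x_num, x_den, sigma=1.0, n_centers=30, n_iter=500)
print(ratio(np.array([[-1.5], [0.0], [1.5]])))
```

In practice the kernel width `sigma` would be chosen by cross-validation rather than fixed. An NP classifier can then be built by thresholding the estimated ratio, with the threshold set to control Type I error.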


research
07/30/2023

Adaptive learning of density ratios in RKHS

Estimating the ratio of two probability densities from finitely many obs...
research
05/02/2019

Phase transition in PCA with missing data: Reduced signal-to-noise ratio, not sample size!

How does missing data affect our ability to learn signal structures? It ...
research
08/30/2022

Empirical and Full Bayes estimation of the type of a Pitman-Yor process

The Pitman-Yor process is a random discrete probability distribution of ...
research
04/27/2021

Propensity Score Estimation Using Density Ratio Model under Item Nonresponse

Missing data is frequently encountered in practice. Propensity score est...
research
05/28/2021

The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization

We propose a model for multiclass classification of time series to make ...
research
01/11/2018

Minimax Optimality of Sign Test for Paired Heterogeneous Data

Comparing two groups under different conditions is ubiquitous in the bio...
research
06/19/2020

Minimax rates without the fixed sample size assumption

We generalize the notion of minimax convergence rate. In contrast to the...
