Test Set Selection using Active Information Acquisition for Predictive Models
In this paper, we consider active information acquisition when the prediction model is meant to be applied on a targeted subset of the population. The goal is to label a pre-specified fraction of customers in the target or test set by iteratively querying for information from the non-target or training set. The number of queries is limited by an overall budget. Arising in the context of two rather disparate applications- banking and medical diagnosis, we pose the active information acquisition problem as a constrained optimization problem. We propose two greedy iterative algorithms for solving the above problem. We conduct experiments with synthetic data and compare results of our proposed algorithms with few other baseline approaches. The experimental results show that our proposed approaches perform better than the baseline schemes.
READ FULL TEXT