Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data

09/10/2018
by Jessa Bekker, et al.

Most positive and unlabeled data is subject to selection biases. The labeled examples can, for example, be selected from the positive set because they are easier to obtain or more obviously positive. This paper investigates how learning can be enabled in this setting. We propose and theoretically analyze an empirical-risk-based method for incorporating the labeling mechanism. Additionally, we investigate under which assumptions learning is possible when the labeling mechanism is not fully understood and propose a practical method to enable this. Our empirical analysis supports the theoretical results and shows that taking into account the possibility of a selection bias, even when the labeling mechanism is unknown, improves the trained classifiers.
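
To make "incorporating the labeling mechanism" into an empirical risk concrete, here is a minimal sketch of one way this could look: inverse-propensity weighting of the labeled examples. It is an illustration under stated assumptions, not necessarily the exact estimator from the full text. The function name `propensity_weighted_risk`, the choice of log loss, and the assumption that each example's labeling probability (`propensity`, the chance that a positive example gets labeled) is known and lies in (0, 1] are all introduced here for illustration.

```python
import numpy as np

def propensity_weighted_risk(pred_pos, labeled, propensity):
    """Illustrative inverse-propensity-weighted log-loss risk for PU data.

    pred_pos   : predicted probability that each example is positive
    labeled    : 1 if the example is a labeled positive, 0 if unlabeled
    propensity : assumed-known probability that the example would be
                 labeled if it were positive; must be in (0, 1]
    """
    pred_pos = np.clip(np.asarray(pred_pos, dtype=float), 1e-12, 1 - 1e-12)
    labeled = np.asarray(labeled, dtype=float)
    propensity = np.asarray(propensity, dtype=float)

    loss_as_pos = -np.log(pred_pos)        # cost of treating the example as positive
    loss_as_neg = -np.log(1.0 - pred_pos)  # cost of treating the example as negative

    inv_e = 1.0 / propensity
    # A labeled example counts 1/e times as a positive and (1 - 1/e) times
    # as a negative; an unlabeled example counts once as a negative.
    per_example = (
        labeled * (inv_e * loss_as_pos + (1.0 - inv_e) * loss_as_neg)
        + (1.0 - labeled) * loss_as_neg
    )
    return per_example.mean()
```

With this weighting, the expected contribution of a truly positive example, averaged over whether it happens to be labeled, equals its ordinary positive-class loss. That is the sense in which the selection bias is corrected. When every propensity is 1, the expression reduces to the standard log loss with labeled examples as positives and unlabeled examples as negatives.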

