Learning with Confident Examples: Rank Pruning for Robust Classification with Noisy Labels

05/04/2017
by Curtis G. Northcutt, et al.

Noisy PN learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate rho1 for positive examples and rho0 for negative examples. We propose Rank Pruning (RP) to solve noisy PN learning and the open problem of estimating the noise rates, i.e. the fraction of wrong positive and negative labels. Unlike prior solutions, RP is time-efficient and general, requiring O(T) time for any unrestricted choice of probabilistic classifier with fitting time T. We prove that RP has consistent noise estimation and the same expected risk as learning with uncorrupted labels in ideal conditions, and we derive closed-form solutions when conditions are non-ideal. RP achieves state-of-the-art noise estimation, F1, error, and AUC-PR on both the MNIST and CIFAR datasets regardless of the amount of noise, and it performs similarly well when a large portion of training examples are noise drawn from a third distribution. To highlight, RP with a CNN classifier can predict if an MNIST digit is a "one" or "not" with only 0.25% error, and 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples.
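
The noise model in the abstract can be illustrated with a minimal sketch. It only simulates the label corruption (each positive flipped with probability rho1, each negative with probability rho0) and is not an implementation of Rank Pruning. The function name `flip_labels`, the class sizes, and the example rates are assumptions chosen for illustration.

```python
import random


def flip_labels(y, rho1, rho0, seed=0):
    """Return noisy labels: flip each 1 with prob. rho1 and each 0 with prob. rho0.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    rng = random.Random(seed)
    noisy = []
    for label in y:
        rate = rho1 if label == 1 else rho0
        noisy.append(1 - label if rng.random() < rate else label)
    return noisy


# Assumed toy setup: 5000 true positives followed by 5000 true negatives.
y = [1] * 5000 + [0] * 5000
s = flip_labels(y, rho1=0.5, rho0=0.1)

# Empirical flip fractions per true class; these should be close to rho1 and rho0.
pos_flipped = sum(1 for yi, si in zip(y, s) if yi == 1 and si == 0) / 5000
neg_flipped = sum(1 for yi, si in zip(y, s) if yi == 0 and si == 1) / 5000
print(f"flipped positives: {pos_flipped:.3f}, flipped negatives: {neg_flipped:.3f}")
```

In noisy PN learning only the corrupted labels `s` are observed, so quantities such as `pos_flipped` cannot be computed directly in practice; estimating them is the noise-rate problem the abstract refers to.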

research
10/31/2019

Confident Learning: Estimating Uncertainty in Dataset Labels

Learning exists in the context of data, yet notions of confidence typica...
research
11/25/2022

Positive unlabeled learning with tensor networks

Positive unlabeled learning is a binary classification problem with posi...
research
05/29/2019

Probabilistic Decoupling of Labels in Classification

We investigate probabilistic decoupling of labels supplied for training,...
research
04/20/2022

Quantity vs Quality: Investigating the Trade-Off between Sample Size and Label Reliability

In this paper, we study learning in probabilistic domains where the lear...
research
08/02/2022

Binary Classification with Positive Labeling Sources

To create a large amount of training labels for machine learning models ...
research
08/14/2020

Negative Confidence-Aware Weakly Supervised Binary Classification for Effective Review Helpfulness Classification

The incompleteness of positive labels and the presence of many unlabelle...
research
12/20/2021

Learning with Label Noise for Image Retrieval by Selecting Interactions

Learning with noisy labels is an active research area for image classifi...
