Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise

02/14/2018
by Dan Hendrycks, et al.

The growing importance of massive datasets in deep learning makes robustness to label noise a critical property for classifiers. Sources of label noise include automatic labeling of large datasets, non-expert labeling, and label corruption by data-poisoning adversaries. In the last case, corruptions may be arbitrarily bad, even so bad that a classifier predicts the wrong labels with high confidence. To protect against such sources of noise, we leverage the fact that a small set of clean labels is often easy to procure. We demonstrate that robustness to label noise, up to severe strengths, can be achieved by using a set of trusted data with clean labels, and we propose a loss correction that uses trusted examples in a data-efficient manner to mitigate the effects of label noise on deep neural network classifiers. Across vision and natural language processing tasks, we experiment with several types of label noise at several strengths, and show that our method significantly outperforms existing methods.
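The abstract does not spell out the form of the loss correction, so the following is only a minimal sketch of one plausible approach. It assumes a corruption-matrix correction: trusted examples are used to estimate how each true class is mapped to noisy labels, and that estimate then reweights the classifier's predictions when training on noisy data. The function names `estimate_corruption_matrix` and `corrected_nll`, and the toy probabilities, are hypothetical and not taken from the paper.

```python
import math


def estimate_corruption_matrix(noisy_probs, trusted_labels, num_classes):
    """Estimate C[i][j] ~ p(noisy label = j | true label = i).

    noisy_probs: per-example class probabilities from a model trained on
        the noisy labels, evaluated on the trusted examples (assumption).
    trusted_labels: the clean labels of those same trusted examples.
    """
    sums = [[0.0] * num_classes for _ in range(num_classes)]
    counts = [0] * num_classes
    for probs, label in zip(noisy_probs, trusted_labels):
        counts[label] += 1
        for j in range(num_classes):
            sums[label][j] += probs[j]

    matrix = []
    for i in range(num_classes):
        if counts[i] == 0:
            # No trusted example for class i: fall back to "no corruption".
            matrix.append([1.0 if j == i else 0.0 for j in range(num_classes)])
        else:
            matrix.append([s / counts[i] for s in sums[i]])
    return matrix


def corrected_nll(clean_probs, noisy_labels, matrix):
    """Mean negative log-likelihood of the noisy labels after correction.

    The model's clean-label probabilities are pushed through the
    corruption matrix, so the loss compares like with like:
    p(noisy = j | x) = sum_i p(true = i | x) * C[i][j].
    """
    total = 0.0
    for probs, noisy_label in zip(clean_probs, noisy_labels):
        p_noisy = sum(p * matrix[i][noisy_label] for i, p in enumerate(probs))
        total -= math.log(max(p_noisy, 1e-12))  # clamp to avoid log(0)
    return total / len(noisy_labels)


# Toy usage with three trusted examples and two classes.
C = estimate_corruption_matrix(
    noisy_probs=[[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]],
    trusted_labels=[0, 0, 1],
    num_classes=2,
)
loss = corrected_nll(clean_probs=[[1.0, 0.0]], noisy_labels=[1], matrix=C)
```

In this sketch the trusted set is used only to estimate the matrix, which is one way a small number of clean labels could be used data-efficiently; the trusted examples could also be trained on directly with an uncorrected loss.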

research
10/01/2019

IEG: Robust Neural Network Training to Tackle Severe Label Noise

Collecting large-scale data with clean labels for supervised training of...
research
07/13/2020

TrustNet: Learning from Trusted Data Against (A)symmetric Label Noise

Robustness to label noise is a critical property for weakly-supervised c...
research
03/08/2022

Trustable Co-label Learning from Multiple Noisy Annotators

Supervised deep learning depends on massive accurately annotated example...
research
03/19/2019

Probabilistic End-to-end Noise Correction for Learning with Noisy Labels

Deep learning has achieved excellent performance in various computer vis...
research
04/19/2021

Do We Really Need Gold Samples for Sample Weighting Under Label Noise?

Learning with label noise has gained significant traction recently due ...
research
08/06/2020

Salvage Reusable Samples from Noisy Data for Robust Learning

Due to the existence of label noise in web images and the high memorizat...
research
10/15/2021

Learning with Noisy Labels by Targeted Relabeling

Crowdsourcing platforms are often used to collect datasets for training ...
