NGC: A Unified Framework for Learning with Open-World Noisy Data

by   Zhi-Fan Wu, et al.

The existence of noisy data is prevalent in both the training and testing phases of machine learning systems, which inevitably leads to the degradation of model performance. There have been plenty of works concentrated on learning with in-distribution (IND) noisy labels in the last decade, i.e., some training samples are assigned incorrect labels that do not correspond to their true classes. Nonetheless, in real application scenarios, it is necessary to consider the influence of out-of-distribution (OOD) samples, i.e., samples that do not belong to any known classes, which has not been sufficiently explored yet. To remedy this, we study a new problem setup, namely Learning with Open-world Noisy Data (LOND). The goal of LOND is to simultaneously learn a classifier and an OOD detector from datasets with mixed IND and OOD noise. In this paper, we propose a new graph-based framework, namely Noisy Graph Cleaning (NGC), which collects clean samples by leveraging geometric structure of data and model predictive confidence. Without any additional training effort, NGC can detect and reject the OOD samples based on the learned class prototypes directly in testing phase. We conduct experiments on multiple benchmarks with different types of noise and the results demonstrate the superior performance of our method against state of the arts.


page 1

page 12

page 13

page 16


Iterative Learning with Open-set Noisy Labels

Large-scale datasets possessing clean label annotations are crucial for ...

LaplaceConfidence: a Graph-based Approach for Learning with Noisy Labels

In real-world applications, perfect labels are rarely available, making ...

EvidentialMix: Learning with Combined Open-set and Closed-set Noisy Labels

The efficacy of deep learning depends on large-scale data sets that have...

Combating Noisy Labels in Long-Tailed Image Classification

Most existing methods that cope with noisy labels usually assume that th...

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels

Deep models trained with noisy labels are prone to over-fitting and stru...

Adaptive Preferential Attached kNN Graph with Distribution-Awareness

Graph-based kNN algorithms have garnered widespread popularity for machi...

Towards a New Understanding of the Training of Neural Networks with Mislabeled Training Data

We investigate the problem of machine learning with mislabeled training ...

Please sign up or login with your details

Forgot password? Click here to reset