A Noise-Robust Loss for Unlabeled Entity Problem in Named Entity Recognition

08/05/2022
by Wentao Kang, et al.

Named Entity Recognition (NER) is an important task in natural language processing. However, traditional supervised NER requires large-scale annotated datasets. Distant supervision has been proposed to reduce the heavy demand for annotated data, but datasets constructed this way are extremely noisy and suffer from a serious unlabeled entity problem. The cross entropy (CE) loss function is highly sensitive to unlabeled data, which leads to severe performance degradation. As an alternative, we propose a new loss function, NRCES, to cope with this problem. It uses a sigmoid term to mitigate the negative impact of noise. In addition, we balance the convergence and noise tolerance of the model according to the samples and the stage of training. Experiments on synthetic and real-world datasets demonstrate that our approach remains strongly robust under a severe unlabeled entity problem and achieves a new state of the art on real-world datasets.

research
12/10/2020

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

In many scenarios, named entity recognition (NER) models severely suffer...
research
08/26/2021

Rethinking Negative Sampling for Unlabeled Entity Problem in Named Entity Recognition

In many situations (e.g., distant supervision), unlabeled entity problem...
research
12/15/2019

Robust Named Entity Recognition with Truecasing Pretraining

Although modern named entity recognition (NER) systems show impressive p...
research
09/04/2022

SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER

Named Entity Recognition is the task to locate and classify the entities...
research
05/22/2023

Better Sampling of Negatives for Distantly Supervised Named Entity Recognition

Distantly supervised named entity recognition (DS-NER) has been proposed...
research
05/06/2023

SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recognition

Distantly-Supervised Named Entity Recognition effectively alleviates the...
research
05/06/2020

Collective Loss Function for Positive and Unlabeled Learning

People learn to discriminate between classes without explicit exposure t...
