Regularizing Models via Pointwise Mutual Information for Named Entity Recognition

04/15/2021
by   Minbyul Jeong, et al.
In Named Entity Recognition (NER), the performance of pre-trained language models has been overestimated because the models rely on dataset biases to solve current benchmark datasets. However, these biases hinder generalizability, which is necessary to address real-world situations such as weak name regularity and a large number of unseen mentions. To reduce the reliance on dataset biases and make the models fully exploit the data, we propose a debiasing method in which the bias-only model is replaced with Pointwise Mutual Information (PMI), enhancing generalization ability while also improving in-domain performance. Our approach debiases highly correlated words and labels in the benchmark datasets, reflects informative statistics via subword frequency, and alleviates the class imbalance between positive and negative examples. For long-named and complex-structure entities, our method can predict these entities by debiasing conjunctions and special characters. Extensive experiments on both general and biomedical domains demonstrate the effectiveness and generalization capabilities of the PMI.
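
As a rough illustration of the statistic the abstract refers to, the sketch below estimates the standard pointwise mutual information, PMI(w, y) = log [ p(w, y) / (p(w) p(y)) ], between tokens and labels from raw co-occurrence counts. It is not the authors' implementation: the function name `pmi_table`, the toy data, and the use of whole words rather than subwords are assumptions made only for this example, and the abstract does not say how the PMI scores are combined with the model during training.

```python
import math
from collections import Counter


def pmi_table(tagged_tokens):
    """Estimate PMI for every observed (token, label) pair.

    tagged_tokens: list of (token, label) tuples (hypothetical input format).
    Returns a dict mapping (token, label) -> log(p(w, y) / (p(w) * p(y))).
    """
    pair_counts = Counter(tagged_tokens)
    token_counts = Counter(tok for tok, _ in tagged_tokens)
    label_counts = Counter(lab for _, lab in tagged_tokens)
    total = len(tagged_tokens)

    table = {}
    for (tok, lab), n in pair_counts.items():
        p_joint = n / total
        p_tok = token_counts[tok] / total
        p_lab = label_counts[lab] / total
        table[(tok, lab)] = math.log(p_joint / (p_tok * p_lab))
    return table


# Toy data, invented for illustration only.
toy = [
    ("Paris", "LOC"), ("Paris", "LOC"), ("Paris", "O"),
    ("visited", "O"), ("the", "O"), ("London", "LOC"),
]
scores = pmi_table(toy)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

A positive score means the token and label co-occur more often than independence would predict, which is the kind of word-label correlation the abstract describes as a dataset bias. Unobserved pairs are simply absent from the table here; a real system would need smoothing or a default value for them.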


research
01/01/2021

How Do Your Biomedical Named Entity Models Generalize to Novel Entities?

The number of biomedical literature on new biomedical concepts is rapidl...
research
06/14/2023

Research on Named Entity Recognition in Improved transformer with R-Drop structure

To enhance the generalization ability of the model and improve the effec...
research
03/07/2022

USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition

This paper describes the system developed by the USTC-NELSLIP team for S...
research
04/08/2020

Entity-Switched Datasets: An Approach to Auditing the In-Domain Robustness of Named Entity Recognition Models

Named entity recognition systems perform well on standard datasets compr...
research
12/10/2020

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

In many scenarios, named entity recognition (NER) models severely suffer...
research
08/26/2021

Rethinking Negative Sampling for Unlabeled Entity Problem in Named Entity Recognition

In many situations (e.g., distant supervision), unlabeled entity problem...
research
04/25/2020

A Rigourous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?

Fine-tuning pretrained model has achieved promising performance on stand...
