Impact of Position Bias on Language Models in Token Classification

04/26/2023
by Mehdi Ben Amor, et al.

Language Models (LMs) have shown state-of-the-art performance in Natural Language Processing (NLP) tasks. Downstream tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging are known to suffer from data imbalance, both in the ratio of positive to negative examples and in class imbalance. In this paper, we investigate an additional issue specific to language models: the position bias of positive examples in token classification tasks. We therefore conduct an in-depth evaluation of the impact of position bias on the performance of LMs fine-tuned on token classification benchmarks. Our study covers CoNLL03 and OntoNotes 5.0 for NER, and the English Universal Dependencies treebank (UD_en) and TweeBank for POS tagging. We propose an evaluation approach for investigating position bias in Transformer models and show that encoders such as BERT, ERNIE, and ELECTRA, as well as decoders such as GPT2 and BLOOM, can suffer from this bias, with average performance drops of 3% and 9%, respectively. To mitigate this effect, we propose two methods, Random Position Shifting and Context Perturbation, which we apply to training batches. The results show an improvement of ≈ 2% in model performance on CoNLL03, UD_en, and TweeBank.
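To illustrate the first mitigation strategy, the snippet below is a minimal sketch of Random Position Shifting for a HuggingFace-style fine-tuning loop. The abstract does not spell out the exact procedure, so the uniform offset sampling and the `random_position_shift` helper are illustrative assumptions rather than the authors' implementation.

```python
import torch

def random_position_shift(batch, max_positions=512):
    """Sketch of Random Position Shifting (assumed form, not the paper's exact recipe).

    Idea: offset the absolute position ids of each training sequence by a random
    amount, so that tokens (and thus positive examples) are not always observed
    at the same absolute positions during fine-tuning.
    """
    input_ids = batch["input_ids"]              # shape: (batch_size, seq_len)
    batch_size, seq_len = input_ids.shape

    # Largest offset that keeps every position id within the model's position range.
    max_shift = max(max_positions - seq_len, 0)

    # One random offset per sequence in the batch.
    shifts = torch.randint(0, max_shift + 1, (batch_size, 1))

    # Default positions 0..seq_len-1, broadcast-added to the per-sequence offsets.
    base_positions = torch.arange(seq_len).unsqueeze(0)
    batch["position_ids"] = base_positions + shifts
    return batch
```

Encoder models such as `BertForTokenClassification` (and GPT2-style decoders) accept a `position_ids` argument in their forward pass, so the shifted positions can be supplied alongside `input_ids` and `attention_mask` for each training batch.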


