An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-Trained Language Models

10/16/2021
by Nicholas Meade, et al.

Recent work has shown that pre-trained language models capture social biases from the text corpora they are trained on. This has spurred the development of techniques that mitigate such biases. In this work, we perform an empirical survey of five recently proposed debiasing techniques: Counterfactual Data Augmentation (CDA), Dropout, Iterative Nullspace Projection, Self-Debias, and SentenceDebias. We quantify the effectiveness of each technique using three different bias benchmarks, while also measuring the impact of these techniques on a model's language modeling ability, as well as its performance on downstream NLU tasks. We experimentally find that: (1) CDA and Self-Debias are the strongest of the debiasing techniques, obtaining improved scores on most of the bias benchmarks; (2) current debiasing techniques do not generalize well beyond gender bias; and (3) improvements on bias benchmarks such as StereoSet and CrowS-Pairs obtained by debiasing are often accompanied by a decrease in language modeling ability, making it difficult to determine whether the bias mitigation was truly effective.
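
To make the surveyed techniques concrete, here is a minimal sketch of Counterfactual Data Augmentation (CDA) for gender bias: each training sentence is paired with a counterfactual copy in which gendered terms are swapped, and the model is trained (or fine-tuned) on the augmented corpus. The word-pair list and the counterfactual helper below are illustrative assumptions, not the paper's exact implementation; the authors' word lists and tokenization differ.

    import re

    # Illustrative gendered word pairs; real CDA word lists are much larger
    # and handle ambiguous cases (e.g. "her" -> "him" vs. "his") more carefully.
    SWAP = {
        "he": "she", "she": "he",
        "him": "her", "her": "him",
        "his": "her",
        "man": "woman", "woman": "man",
        "men": "women", "women": "men",
        "father": "mother", "mother": "father",
        "son": "daughter", "daughter": "son",
    }
    PATTERN = re.compile(r"\b(" + "|".join(SWAP) + r")\b", re.IGNORECASE)

    def counterfactual(sentence: str) -> str:
        """Swap gendered terms, preserving initial capitalization."""
        def sub(match: re.Match) -> str:
            word = match.group(0)
            swapped = SWAP[word.lower()]
            return swapped.capitalize() if word[0].isupper() else swapped
        return PATTERN.sub(sub, sentence)

    # CDA trains on the union of original and counterfactual sentences.
    corpus = ["He is a doctor and his mother is proud."]
    augmented = corpus + [counterfactual(s) for s in corpus]
    print(augmented[1])  # She is a doctor and her father is proud.

For contrast, the projection-based techniques (SentenceDebias and Iterative Nullspace Projection) operate on representations rather than training data. As a sketch, assuming a bias subspace B with orthonormal columns has already been estimated (e.g., via PCA over embeddings of counterfactual sentence pairs), SentenceDebias removes an embedding's component lying in that subspace:

    import numpy as np

    def remove_bias_component(x: np.ndarray, B: np.ndarray) -> np.ndarray:
        """Project out the estimated bias subspace: x - B (B^T x)."""
        return x - B @ (B.T @ x)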

Related research

05/03/2021 · Impact of Gender Debiased Word Embeddings in Language Modeling
Gender, race and social biases have recently been detected as evident ex...

04/17/2023 · Effectiveness of Debiasing Techniques: An Indigenous Qualitative Analysis
An indigenous perspective on the effectiveness of debiasing techniques f...

02/11/2023 · Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns
Bias-measuring datasets play a critical role in detecting biased behavio...

01/22/2023 · An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models
Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from m...

07/04/2023 · Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases
As the representation capability of Pre-trained Language Models (PLMs) i...

11/05/2022 · HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models
Fairness has become a trending topic in natural language processing (NLP...

07/03/2022 · Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models
Vision-Language Pre-training (VLP) models have achieved state-of-the-art...
