Comparing Biases and the Impact of Multilingual Training across Multiple Languages

05/18/2023
by Sharon Levy, et al.

Studies of bias and fairness in natural language processing have primarily examined social biases within a single language and/or across a few attributes (e.g., gender, race). However, biases for a given attribute can manifest differently from one language to another. As a result, it is critical to examine biases within each language and attribute. It is equally important to study how these biases compare across languages and how they are affected when a model is trained on multilingual rather than monolingual data. We present a bias analysis across Italian, Chinese, English, Hebrew, and Spanish on the downstream sentiment analysis task to observe whether specific demographics are viewed more positively. We study bias similarities and differences across these languages and investigate the impact of multilingual vs. monolingual training data. We adapt existing English sentiment bias templates to Italian, Chinese, Hebrew, and Spanish for four attributes: race, religion, nationality, and gender. Our results reveal similarities in bias expression, such as favoritism toward groups that are dominant in each language's culture (e.g., majority religions and nationalities). Additionally, we find increased variation in predictions across protected groups, indicating bias amplification, after multilingual finetuning in comparison to multilingual pretraining.
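
The template-based measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the templates, group terms, and the `toy_sentiment` scorer are hypothetical placeholders, and population variance is only one assumed way to quantify "variation in predictions across protected groups". In practice the scorer would be a trained sentiment classifier for the language under study.

```python
from statistics import pvariance

# Hypothetical templates and group terms, for illustration only.
TEMPLATES = [
    "I talked to a {group} person yesterday.",
    "My neighbor is {group}.",
]
RELIGION_GROUPS = ["Christian", "Jewish", "Muslim"]


def toy_sentiment(sentence: str) -> float:
    """Placeholder scorer returning a value in [0, 1).

    Assumption: stands in for a real sentiment model; the score is
    arbitrary and carries no meaning.
    """
    return (sum(ord(ch) for ch in sentence) % 100) / 100


def group_scores(templates, groups, scorer):
    """Average the scorer's output over all templates for each group."""
    return {
        group: sum(scorer(t.format(group=group)) for t in templates) / len(templates)
        for group in groups
    }


def prediction_variance(scores):
    """Population variance of mean scores across groups (assumed metric)."""
    return pvariance(list(scores.values()))


if __name__ == "__main__":
    scores = group_scores(TEMPLATES, RELIGION_GROUPS, toy_sentiment)
    for group, score in scores.items():
        print(f"{group}: {score:.2f}")
    print(f"variance across groups: {prediction_variance(scores):.4f}")
```

Under this assumed metric, a larger variance for one model than another on the same templates would mean its predictions depend more strongly on the group term, which is the kind of comparison the abstract draws between finetuned and pretrained models.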

research
07/04/2023

On Evaluating and Mitigating Gender Biases in Multilingual Settings

While understanding and removing gender biases in language models has be...
research
04/07/2022

Mapping the Multilingual Margins: Intersectional Biases of Sentiment Analysis Systems in English, Spanish, and Arabic

As natural language processing systems become more widespread, it is nec...
research
05/11/2018

Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems

Automatic machine learning systems can inadvertently accentuate and perp...
research
05/22/2023

Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis

Sentiment analysis (SA) systems are widely deployed in many of the world...
research
10/21/2020

Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia

Specific lexical choices in how people are portrayed both reflect the wr...
research
01/03/2023

Average Is Not Enough: Caveats of Multilingual Evaluation

This position paper discusses the problem of multilingual evaluation. Us...
research
05/24/2023

This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models

We introduce the notion of geopolitical bias – a tendency to report diff...
