What about em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns

05/25/2023
by Anne Lauscher, et al.

As 3rd-person pronoun usage shifts to include novel forms, e.g., neopronouns, we need more research on identity-inclusive NLP. Exclusion is particularly harmful in one of the most popular NLP applications, machine translation (MT). Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this “reality check”, we study how three commercial MT systems translate 3rd-person pronouns. Concretely, we compare the translations of gendered vs. gender-neutral pronouns from English to five other languages (Danish, Farsi, French, German, Italian), and vice versa, from Danish to English. Our error analysis shows that the presence of a gender-neutral pronoun often leads to grammatical and semantic translation errors. Similarly, gender neutrality is often not preserved. By surveying the opinions of affected native speakers from diverse languages, we provide recommendations to address the issue in future MT research.


research
06/09/2023

Good, but not always Fair: An Evaluation of Gender Bias for three commercial Machine Translation Systems

Machine Translation (MT) continues to make significant strides in qualit...
research
11/02/2022

MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation

As generic machine translation (MT) quality has improved, the need for t...
research
10/12/2020

Gender Coreference and Bias Evaluation at WMT 2020

Gender bias in machine translation can manifest when choosing gender inf...
research
06/15/2023

Participatory Research as a Path to Community-Informed, Gender-Fair Machine Translation

Recent years have seen a strongly increased visibility of non-binary peo...
research
09/07/2022

Facilitating Global Team Meetings Between Language-Based Subgroups: When and How Can Machine Translation Help?

Global teams frequently consist of language-based subgroups who put toge...
research
05/09/2022

CoCoA-MT: A Dataset and Benchmark for Contrastive Controlled MT with Application to Formality

The machine translation (MT) task is typically formulated as that of ret...
research
10/18/2021

The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses

Gender bias in natural language processing (NLP) applications, particula...
