Evaluating Biased Attitude Associations of Language Models in an Intersectional Context

07/07/2023
by Shiva Omrani Sabbaghi, et al.

Language models are trained on large-scale corpora that embed the implicit biases documented in social psychology. In social cognition, valence associations (pleasantness or unpleasantness) with social groups shape biased attitudes toward groups and concepts. Building on this established literature, we quantify how social groups are valenced in English language models using a sentence template that provides an intersectional context. We study biases related to age, education, gender, height, intelligence, literacy, race, religion, sex, sexual orientation, social class, and weight. We present a concept projection approach that captures the valence subspace from the contextualized word embeddings of language models. Adapting this projection-based approach to embedding association tests that quantify bias, we find that language models exhibit the most biased attitudes toward gender identity, social class, and sexual orientation signals in language. The largest and best-performing model we study is also the most biased, as it effectively captures the bias embedded in sociocultural data. We validate the bias evaluation method through its strong performance on an intrinsic valence evaluation task. The approach enables us to measure complex intersectional biases as they are known to manifest in the outputs and applications of language models that perpetuate historical biases. Moreover, our approach contributes to design justice by studying the associations of groups underrepresented in language, such as transgender and homosexual individuals.
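To make the projection idea concrete, below is a minimal sketch of how a valence axis could be estimated from embeddings of pleasant and unpleasant anchor words and then used to score group embeddings, in the spirit of projection-based embedding association tests. The mean-difference axis estimate, the function names, and the toy random vectors are illustrative assumptions, not the authors' implementation, which operates on contextualized embeddings extracted from language models via an intersectional sentence template.

```python
import numpy as np

def valence_direction(pleasant_vecs: np.ndarray, unpleasant_vecs: np.ndarray) -> np.ndarray:
    """Estimate a valence axis as the unit-normalized difference between the
    mean embeddings of pleasant and unpleasant anchor words (an assumption;
    other subspace estimates, e.g. PCA, are possible)."""
    direction = pleasant_vecs.mean(axis=0) - unpleasant_vecs.mean(axis=0)
    return direction / np.linalg.norm(direction)

def valence_score(group_vec: np.ndarray, direction: np.ndarray) -> float:
    """Project a (contextualized) group embedding onto the valence axis.
    Positive values indicate association with pleasantness."""
    return float(np.dot(group_vec, direction) / np.linalg.norm(group_vec))

def association_effect_size(group_a: np.ndarray, group_b: np.ndarray, direction: np.ndarray) -> float:
    """WEAT-style effect size comparing the valence projections of two sets of
    sentence embeddings (e.g. sentences mentioning different social groups)."""
    scores_a = np.array([valence_score(v, direction) for v in group_a])
    scores_b = np.array([valence_score(v, direction) for v in group_b])
    pooled_std = np.concatenate([scores_a, scores_b]).std(ddof=1)
    return (scores_a.mean() - scores_b.mean()) / pooled_std

# Toy usage: random vectors stand in for contextualized embeddings pulled
# from a language model for templated sentences.
rng = np.random.default_rng(0)
dim = 16
pleasant = rng.normal(size=(8, dim)) + 0.5
unpleasant = rng.normal(size=(8, dim)) - 0.5
axis = valence_direction(pleasant, unpleasant)
group_a = rng.normal(size=(10, dim)) + 0.2
group_b = rng.normal(size=(10, dim)) - 0.2
print(association_effect_size(group_a, group_b, axis))
```

A positive effect size here would indicate that the first group's embeddings sit closer to the pleasant end of the estimated valence axis than the second group's; the sign convention and the pooled-standard-deviation normalization follow the usual WEAT-style formulation.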


