VAST: The Valence-Assessing Semantics Test for Contextualizing Language Models

03/14/2022
by   Robert Wolfe, et al.
0

VAST, the Valence-Assessing Semantics Test, is a novel intrinsic evaluation task for contextualized word embeddings (CWEs). VAST uses valence, the association of a word with pleasantness, to measure the correspondence of word-level LM semantics with widely used human judgments, and examines the effects of contextualization, tokenization, and LM-specific geometry. Because prior research has found that CWEs from GPT-2 perform poorly on other intrinsic evaluations, we select GPT-2 as our primary subject, and include results showing that VAST is useful for 7 other LMs, and can be used in 7 languages. GPT-2 results show that the semantics of a word incorporate the semantics of context in layers closer to model output, such that VAST scores diverge between our contextual settings, ranging from Pearson's rho of .55 to .77 in layer 11. We also show that multiply tokenized words are not semantically encoded until layer 8, where they achieve Pearson's rho of .46, indicating the presence of an encoding process for multiply tokenized words which differs from that of singly tokenized words, for which rho is highest in layer 0. We find that a few neurons with values having greater magnitude than the rest mask word-level semantics in GPT-2's top layer, but that word-level semantics can be recovered by nullifying non-semantic principal components: Pearson's rho in the top layer improves from .32 to .76. After isolating semantics, we show the utility of VAST for understanding LM semantics via improvements over related work on four word similarity tasks, with a score of .50 on SimLex-999, better than the previous best of .45 for GPT-2. Finally, we show that 8 of 10 WEAT bias tests, which compare differences in word embedding associations between groups of words, exhibit more stereotype-congruent biases after isolating semantics, indicating that non-semantic structures in LMs also mask biases.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/06/2020

ValNorm: A New Word Embedding Intrinsic Evaluation Method Reveals Valence Biases are Consistent Across Languages and Over Decades

Word embeddings learn implicit biases from linguistic regularities captu...
research
10/30/2020

"Thy algorithm shalt not bear false witness": An Evaluation of Multiclass Debiasing Methods on Word Embeddings

With the vast development and employment of artificial intelligence appl...
research
06/07/2022

Gender Bias in Word Embeddings: A Comprehensive Analysis of Frequency, Syntax, and Semantics

The statistical regularities in language corpora encode well-known socia...
research
08/28/2018

WiC: 10,000 Example Pairs for Evaluating Context-Sensitive Representations

By design, word embeddings are unable to model the dynamic nature of wor...
research
03/14/2022

Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations

We examine the effects of contrastive visual semantic pretraining by com...
research
08/25/2016

Semantics derived automatically from language corpora contain human-like biases

Artificial intelligence and machine learning are in a period of astoundi...

Please sign up or login with your details

Forgot password? Click here to reset