Patterns of Lexical Ambiguity in Contextualised Language Models

09/27/2021
by Janosch Haber, et al.

One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts. In this paper we investigate the extent to which the contextualised embeddings of word forms that display multiplicity of sense reflect traditional distinctions of polysemy and homonymy. To this end, we introduce an extended, human-annotated dataset of graded word sense similarity and co-predication acceptability, and evaluate how well the similarity of embeddings predicts similarity in meaning. Both types of human judgements indicate that the similarity of polysemic interpretations falls in a continuum between identity of meaning and homonymy. However, we also observe significant differences within the similarity ratings of polysemes, forming consistent patterns for different types of polysemic sense alternation. Our dataset thus appears to capture a substantial part of the complexity of lexical ambiguity, and can provide a realistic test bed for contextualised embeddings. Among the tested models, BERT Large shows the strongest correlation with the collected word sense similarity ratings, but struggles to consistently replicate the observed similarity patterns. When clustering ambiguous word forms based on their embeddings, the model displays high confidence in discerning homonyms and some types of polysemic alternations, but consistently fails for others.
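The evaluation described above, checking how well embedding similarity predicts human similarity judgements, can be illustrated with a minimal sketch. It assumes cosine similarity as the embedding similarity measure and Spearman rank correlation as the agreement measure; the abstract names neither, so both are assumptions. The two-dimensional vectors and the ratings are invented toy values, not data from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical toy data: (embedding of use 1, embedding of use 2, human rating).
# Real contextualised embeddings would have hundreds of dimensions.
pairs = [
    ([1.0, 0.0], [1.0, 0.1], 0.95),  # same sense
    ([1.0, 0.0], [0.7, 0.7], 0.60),  # polysemic alternation
    ([1.0, 0.0], [0.0, 1.0], 0.05),  # homonyms
]

model_sims = [cosine_similarity(u, v) for u, v, _ in pairs]
human_ratings = [rating for _, _, rating in pairs]
rho = spearman(model_sims, human_ratings)
print(f"Spearman correlation: {rho:.3f}")
```

In this toy setup the model similarities fall in the same order as the human ratings, so the correlation is close to 1; a model that ranked a polysemic pair below a homonym pair would score lower.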


