Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings

12/04/2020
by Na Li, et al.

While the success of pre-trained language models has largely eliminated the need for high-quality static word vectors in many NLP applications, such vectors continue to play an important role in tasks where word meaning needs to be modelled in the absence of linguistic context. In this paper, we explore how the contextualised embeddings predicted by BERT can be used to produce high-quality word vectors for such settings, in particular for knowledge base completion, where our focus is on capturing the semantic properties of nouns. We find that a simple strategy of averaging the contextualised embeddings of masked word mentions leads to vectors that outperform the static word vectors learned by BERT, as well as those from standard word embedding models, on property induction tasks. In particular, we find that masking the target words is critical to achieving this strong performance, as the resulting vectors focus less on idiosyncratic properties and more on general semantic properties. Building on this insight, we propose a filtering strategy aimed at removing the most idiosyncratic mention vectors, which allows us to obtain further performance gains in property induction.

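As a concrete illustration of the approach described above, the sketch below assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; it is not the authors' released code. It masks each mention of a target noun, averages BERT's contextualised vectors at the [MASK] positions, and uses similarity to the centroid of the mention vectors as a stand-in for the paper's filtering of idiosyncratic mentions (the exact filtering criterion in the paper may differ).

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def masked_mention_vectors(sentences, target):
    # Return one contextualised vector per mention, with the target word replaced by [MASK].
    vectors = []
    for sent in sentences:
        masked = sent.replace(target, tokenizer.mask_token)
        inputs = tokenizer(masked, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]  # (seq_len, dim)
        is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
        if is_mask.any():
            vectors.append(hidden[is_mask].mean(dim=0))  # average over the [MASK] positions
    return vectors

def filtered_word_vector(mention_vectors, keep_ratio=0.8):
    # Average the mention vectors closest to their centroid, dropping the most atypical ones;
    # a plausible proxy for the paper's idiosyncrasy filter, not necessarily its exact criterion.
    stacked = torch.stack(mention_vectors)
    centroid = stacked.mean(dim=0, keepdim=True)
    sims = torch.nn.functional.cosine_similarity(stacked, centroid)
    k = max(1, int(keep_ratio * len(mention_vectors)))
    keep = sims.topk(k).indices
    return stacked[keep].mean(dim=0)

# Hypothetical usage on a handful of mention sentences for the noun "banana":
mentions = masked_mention_vectors(
    ["she peeled a banana for breakfast", "the banana was ripe and sweet"],
    "banana")
word_vector = filtered_word_vector(mentions)
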
Related research

06/15/2021
Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection
One of the long-standing challenges in lexical semantics consists in lea...

05/16/2023
Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned Language Models
Learning vectors that capture the meaning of concepts remains a fundamen...

04/12/2021
Learning to Remove: Towards Isotropic Pre-trained BERT Embedding
Pre-trained language models such as BERT have become a more common choic...

05/30/2018
What the Vec? Towards Probabilistically Grounded Embeddings
Vector representation, or embedding, of words is commonly achieved with ...

11/09/2020
Catch the "Tails" of BERT
Recently, contextualized word embeddings outperform static word embeddin...

04/14/2021
Static Embeddings as Efficient Knowledge Bases?
Recent research investigates factual knowledge stored in large pretraine...

05/25/2023
Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models
Vector space models of word meaning all share the assumption that words ...
