What you can cram into a single vector: Probing sentence embeddings for linguistic properties

05/03/2018
by Alexis Conneau, et al.

Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing. "Downstream" tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations. However, the complexity of these tasks makes it difficult to infer what kind of information is present in the representations. Here we introduce 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods.
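To make the idea of a probing task concrete, here is a minimal sketch, not the authors' setup. It assumes a toy property (whether a sentence is short or long), random word vectors in place of a trained encoder, and a nearest-centroid classifier standing in for a trained probe. All names (`probe_accuracy`, `embed`, the `pooling` options) are hypothetical. The probe only ever sees the fixed sentence vector, so its accuracy indicates whether that vector retains the property.

```python
import random


def embed(sentence, word_vectors, pooling):
    """Pool word vectors into one fixed-size sentence vector (toy encoder)."""
    dim = len(word_vectors[0])
    total = [0.0] * dim
    for word_id in sentence:
        for i, value in enumerate(word_vectors[word_id]):
            total[i] += value
    if pooling == "mean":
        return [t / len(sentence) for t in total]
    return total  # "sum" pooling


def centroid(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def probe_accuracy(pooling, seed=0):
    """Fit a nearest-centroid probe for 'short vs. long' and return test accuracy."""
    rng = random.Random(seed)
    vocab_size, dim = 50, 16  # illustrative sizes, chosen arbitrarily
    # Assumption: random non-negative word vectors replace a trained encoder.
    word_vectors = [[rng.random() for _ in range(dim)] for _ in range(vocab_size)]

    data = []
    for _ in range(400):
        label = rng.randrange(2)  # 0 = short sentence, 1 = long sentence
        length = rng.randint(3, 6) if label == 0 else rng.randint(10, 14)
        sentence = [rng.randrange(vocab_size) for _ in range(length)]
        data.append((embed(sentence, word_vectors, pooling), label))

    train, test = data[:300], data[300:]
    centroids = {
        lab: centroid([x for x, y in train if y == lab]) for lab in (0, 1)
    }
    correct = 0
    for x, y in test:
        predicted = min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))
        correct += predicted == y
    return correct / len(test)


if __name__ == "__main__":
    for pooling in ("sum", "mean"):
        print(pooling, probe_accuracy(pooling))
```

In this toy setting, sum pooling keeps length information in the magnitude of the vector, so the probe should separate the two classes easily. Mean pooling normalizes length away, so the same probe would be expected to score much closer to chance. A real probing study would compare trained encoders in the same way.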

