Language-agnostic BERT Sentence Embedding

07/03/2020
by Fangxiaoyu Feng, et al.

We adapt multilingual BERT to produce language-agnostic sentence embeddings for 109 languages. The dominant approach to multilingual NLP tasks is masked language model (MLM) pretraining followed by task-specific fine-tuning. While English sentence embeddings have been obtained by fine-tuning a pretrained BERT model, such models have not been applied to multilingual sentence embeddings. Our model combines masked language model (MLM) and translation language model (TLM) pretraining with a translation ranking task using bi-directional dual encoders. The resulting multilingual sentence embeddings improve average bi-text retrieval accuracy over 112 languages to 83.7% on Tatoeba. Our sentence embeddings also establish new state-of-the-art results on BUCC and UN bi-text retrieval.
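The training objective named in the abstract (a translation ranking task over bi-directional dual encoders) can be sketched in a few lines. The PyTorch function below is a minimal illustration, not the authors' released code: it assumes each batch holds L2-normalized embeddings of parallel sentence pairs from a shared encoder, and the `margin` and `scale` values are placeholder assumptions, with the additive margin on true pairs following the additive-margin softmax commonly used for this kind of ranking loss.

```python
import torch
import torch.nn.functional as F

def bidirectional_translation_ranking_loss(src_emb, tgt_emb,
                                           margin=0.3, scale=10.0):
    """In-batch translation ranking loss for a dual encoder.

    src_emb, tgt_emb: (batch, dim) L2-normalized sentence embeddings;
    row i of src_emb and row i of tgt_emb are a translation pair.
    """
    # Cosine similarity between every source and every target in the batch.
    sim = src_emb @ tgt_emb.t()  # (batch, batch)
    # Additive margin: subtract a margin from the true-pair similarities
    # only, forcing translations to be separated from in-batch negatives.
    sim = sim - margin * torch.eye(sim.size(0), device=sim.device)
    sim = sim * scale
    targets = torch.arange(sim.size(0), device=sim.device)
    # Rank each sentence's true translation above all other sentences in
    # the batch, in both source-to-target and target-to-source directions.
    return 0.5 * (F.cross_entropy(sim, targets) +
                  F.cross_entropy(sim.t(), targets))
```

Optimizing this in both directions pushes translations of the same sentence toward the same point in the embedding space, which is what makes nearest-neighbor search over the normalized vectors usable for bi-text retrieval; the trained model was later released as LaBSE (e.g., on TensorFlow Hub).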

Related research

Universal Language Model Fine-Tuning with Subword Tokenization for Polish (10/24/2018)
Universal Language Model for Fine-tuning [arXiv:1801.06146] (ULMFiT) is ...

Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases (07/26/2022)
Sentence embeddings are commonly used in text clustering and semantic re...

Coarse-to-Fine Memory Matching for Joint Retrieval and Classification (11/29/2020)
We present a novel end-to-end language model for joint retrieval and cla...

Multilingual Universal Sentence Encoder for Semantic Retrieval (07/09/2019)
We introduce two pre-trained retrieval focused multilingual sentence enc...

LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM (05/10/2023)
Text embeddings are useful features for several NLP applications, such a...

drsphelps at SemEval-2022 Task 2: Learning idiom representations using BERTRAM (04/06/2022)
This paper describes our system for SemEval-2022 Task 2 Multilingual Idi...

Learning to Match Job Candidates Using Multilingual Bi-Encoder BERT (09/15/2021)
In this talk, we will show how we used Randstad history of candidate pla...
