On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning

08/21/2019
by Yerai Doval, et al.

Cross-lingual word embeddings are vector representations of words in different languages where words with similar meanings are represented by similar vectors, regardless of the language. Recent approaches that construct these embeddings by aligning monolingual spaces have shown that accurate alignments can be obtained with little or no supervision. However, evaluation has focused on a particular controlled scenario, and there is no strong evidence of how current state-of-the-art systems would fare with noisy text or with language pairs exhibiting major linguistic differences. In this paper we present an extensive evaluation over multiple cross-lingual embedding models, analyzing their strengths and limitations with respect to different variables such as target language, training corpora and amount of supervision. Our conclusions cast doubt on the view that high-quality cross-lingual embeddings can always be learned without much supervision.


Related research

- Training Cross-Lingual embeddings for Setswana and Sepedi (11/11/2021): African languages still lag in the advances of Natural Language Processi...
- Learning Cross-lingual Embeddings from Twitter via Distant Supervision (05/17/2019): Cross-lingual embeddings represent the meaning of words from different l...
- Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings (05/31/2022): Bilingual Word Embeddings (BWEs) are one of the cornerstones of cross-li...
- Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy (11/01/2018): Cross-lingual word embeddings aim to capture common linguistic regularit...
- Density Matching for Bilingual Word Embedding (04/04/2019): Recent approaches to cross-lingual word embedding have generally been ba...
- Are All Good Word Vector Spaces Isomorphic? (04/08/2020): Existing algorithms for aligning cross-lingual word vector spaces assume...
- How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions (02/01/2019): Cross-lingual word embeddings (CLEs) enable multilingual modeling of mea...
