A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping the Linguistic Blood Bank

05/09/2022
by   Dan Malkin, et al.

We show that the choice of pretraining languages affects downstream cross-lingual transfer for BERT-based models. We inspect zero-shot performance under balanced data conditions to mitigate data-size confounds, classifying pretraining languages that improve downstream performance as donors, and languages whose zero-shot performance improves as recipients. We develop a method that is quadratic in the number of languages to estimate these relations, rather than exhaustively evaluating all exponentially many pretraining combinations. We find that our method is effective on a diverse set of languages spanning different linguistic features and two downstream tasks. Our findings can inform developers of large-scale multilingual language models in choosing better pretraining configurations.
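
The donor/recipient estimation lends itself to a short illustration. The sketch below is a hypothetical rendering, not the paper's actual pipeline: `zero_shot_score` is a stand-in for a full pretrain/fine-tune/evaluate run, and the additive aggregation over pairs is an assumption. What it demonstrates is the complexity claim from the abstract: scoring ordered language pairs needs O(n^2) runs, whereas exhausting all pretraining-language subsets needs O(2^n).

```python
from itertools import permutations
from collections import defaultdict

LANGUAGES = ["en", "ru", "tr", "ar", "hi"]  # illustrative subset

def zero_shot_score(pretrain_lang: str, target_lang: str) -> float:
    """Hypothetical stand-in: pretrain a BERT-style model on balanced data
    from `pretrain_lang`, fine-tune on the downstream task, and return its
    zero-shot score on `target_lang`. Stubbed out here."""
    return 0.0  # replace with a real train/evaluate call

def donor_recipient_scores(langs):
    """Aggregate pairwise transfer into per-language donor scores
    (average help given) and recipient scores (average help received),
    using O(n^2) pairwise runs instead of 2^n subset combinations."""
    donor = defaultdict(float)
    recipient = defaultdict(float)
    for src, tgt in permutations(langs, 2):  # all ordered pairs
        gain = zero_shot_score(src, tgt)
        donor[src] += gain
        recipient[tgt] += gain
    n = len(langs) - 1  # each language appears in n pairs per role
    return ({l: s / n for l, s in donor.items()},
            {l: s / n for l, s in recipient.items()})
```

With a real training backend plugged in, five languages cost 20 pairwise runs versus 31 non-empty pretraining subsets, and the gap grows exponentially as languages are added.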


Related research

10/27/2021
When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer
While recent work on multilingual language models has demonstrated their...

05/01/2020
From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers
Massively multilingual transformers pretrained with language modeling ob...

08/16/2020
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
In this paper, we propose to investigate the problem of out-of-domain vi...

05/17/2022
Feature Aggregation in Zero-Shot Cross-Lingual Transfer Using Multilingual BERT
Multilingual BERT (mBERT), a language model pre-trained on large multili...

03/21/2022
Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability
Pretrained multilingual models enable zero-shot learning even for unseen...

10/06/2021
Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning
Multilingual models jointly pretrained on multiple languages have achiev...

10/16/2020
SIGTYP 2020 Shared Task: Prediction of Typological Features
Typological knowledge bases (KBs) such as WALS (Dryer and Haspelmath, 20...
