Exploring Unsupervised Pretraining Objectives for Machine Translation

06/10/2021
by Christos Baziotis et al.

Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT) by drastically reducing the need for large parallel data. Most approaches adapt masked language modeling (MLM) to sequence-to-sequence architectures by masking parts of the input and reconstructing them in the decoder. In this work, we systematically compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context. We pretrain models with the different methods on English↔German, English↔Nepali, and English↔Sinhala monolingual data and evaluate them on NMT. In (semi-)supervised NMT, varying the pretraining objective leads to surprisingly small differences in finetuned performance, whereas unsupervised NMT is much more sensitive to it. To understand these results, we thoroughly study the pretrained models using a series of probes and verify that they encode and use information in different ways. We conclude that finetuning on parallel data is mostly sensitive to a few properties that are shared by most models, such as a strong decoder, whereas unsupervised NMT also requires models with strong cross-lingual abilities.
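To make the contrast between the objectives concrete, below is a minimal, hypothetical Python sketch of the three kinds of input noising the abstract refers to: masking parts of the input (MLM-style), reordering words so the input still resembles a full sentence, and replacing words. The function names, noise ratios, and the uniform sampling of replacements are illustrative assumptions, not the authors' implementation; in particular, the paper's replacement objective chooses substitutes based on context, which is not modeled here.

import random

MASK = "<mask>"  # placeholder symbol, as used in MLM-style denoising

def mask_tokens(tokens, ratio=0.35):
    # MLM-style noising: hide a fraction of tokens behind a mask symbol;
    # the decoder is trained to reconstruct the original sentence.
    out = list(tokens)
    n = max(1, int(len(out) * ratio))
    for i in random.sample(range(len(out)), k=n):
        out[i] = MASK
    return out

def shuffle_tokens(tokens, window=3):
    # Reordering noising: locally permute tokens, so the input still looks
    # like a full sentence, only with scrambled word order.
    out = list(tokens)
    for start in range(0, len(out), window):
        chunk = out[start:start + window]
        random.shuffle(chunk)
        out[start:start + window] = chunk
    return out

def replace_tokens(tokens, candidates, ratio=0.35):
    # Replacement noising: substitute a fraction of tokens with other words,
    # so the input resembles a real sentence rather than a masked one.
    # `candidates` is a stand-in for context-based sampling (e.g., from a
    # language model), which this sketch does not implement.
    out = list(tokens)
    n = max(1, int(len(out) * ratio))
    for i in random.sample(range(len(out)), k=n):
        out[i] = random.choice(candidates)
    return out

if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    print(mask_tokens(sentence))
    print(shuffle_tokens(sentence))
    print(replace_tokens(sentence, candidates=["cat", "runs", "red", "tall"]))

In all three cases the decoder's target is the original sentence; the objectives differ only in how the encoder input is corrupted, which is exactly the axis of variation the paper studies.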


Related research

Unsupervised Neural Machine Translation (10/30/2017)
In spite of the recent success of neural machine translation (NMT) in st...

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation (03/16/2022)
In this paper, we present a substantial step in better understanding the...

Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation (10/19/2020)
This work presents our ongoing research of unsupervised pretraining in n...

PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining (08/04/2021)
Despite the success of multilingual sequence-to-sequence pretraining, mo...

Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation (03/18/2021)
Successful methods for unsupervised neural machine translation (UNMT) em...

On the Role of Parallel Data in Cross-lingual Transfer Learning (12/20/2022)
While prior work has established that the use of parallel data is conduc...

Multi-Agent Cross-Translated Diversification for Unsupervised Machine Translation (06/03/2020)
Recent unsupervised machine translation (UMT) systems usually employ thr...
