Lego-MT: Towards Detachable Models in Massively Multilingual Machine Translation

12/20/2022 · by Fei Yuan, et al.

Traditional multilingual neural machine translation (MNMT) uses a single model to translate in all directions. However, as the number of language pairs grows, using a single model for massive MNMT brings new challenges: parameter tension and large computational cost. In this paper, we revisit multi-way structures by assigning an individual branch to each language (group). Despite the simplicity of this architecture, training such decentralized models is challenging because there is no constraint aligning the representations of all languages. We propose a localized training recipe that maps the different branches into a unified space, yielding an efficient detachable model, Lego-MT. For a fair comparison, we collect data from OPUS and build the first large-scale open-source translation benchmark covering 7 language-centric datasets, each containing 445 language pairs. Experiments show that Lego-MT (1.2B parameters) gains more than 4 BLEU while outperforming M2M-100 (12B). (We will release all training data, models, and checkpoints.)
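The core idea — independent per-language branches pulled into one shared space by an alignment objective — can be illustrated with a minimal sketch. This is not the authors' code: the linear branches, the frozen "unified" anchor encoder, and the hand-computed MSE alignment loss are all simplifying assumptions used only to show why each branch remains detachable after training.

```python
# Hypothetical sketch of a detachable multi-branch layout: each language
# gets its own encoder branch, and an alignment loss pulls every branch's
# output toward one unified representation space.
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_SHARED = 16, 8

# One independent branch (a simple linear map here) per language group.
branches = {lang: rng.normal(size=(D_IN, D_SHARED)) for lang in ("en", "fr", "zh")}
unified = rng.normal(size=(D_IN, D_SHARED))  # frozen shared "anchor" encoder

def align_step(lang, x, lr=0.01):
    """One localized training step: move a single language branch toward
    the unified space via an MSE alignment loss (gradient by hand).
    Only that branch's parameters are touched, so branches stay detachable."""
    W = branches[lang]
    target = x @ unified                  # representation in the unified space
    pred = x @ W                          # this branch's representation
    grad = x.T @ (pred - target) / len(x)  # d(MSE)/dW
    branches[lang] = W - lr * grad
    return float(np.mean((pred - target) ** 2))

x = rng.normal(size=(32, D_IN))           # a toy batch of source features
losses = [align_step("fr", x) for _ in range(200)]
# The alignment loss shrinks while "en" and "zh" branches are untouched,
# so any branch can be loaded or dropped independently at inference time.
```

Because each step updates only one branch against a fixed shared target, training one language never perturbs another's parameters — the property that makes the model "detachable."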

Related research:

- Beyond English-Centric Multilingual Machine Translation (10/21/2020)
- University of Cape Town's WMT22 System: Multilingual Machine Translation for Southern African Languages (10/21/2022)
- Learning Language Specific Sub-network for Multilingual Machine Translation (05/19/2021)
- Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning (05/14/2022)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information (10/07/2020)
- Findings of the Covid-19 MLIA Machine Translation Task (11/14/2022)
- Scalable and Efficient MoE Training for Multitask Multilingual Models (09/22/2021)
