Multilingual Machine Translation with Hyper-Adapters

05/22/2022
by   Christos Baziotis, et al.
0

Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks using hyper-adapters – hyper-networks that generate adapters from language and layer embeddings. While past work had poor results when scaling hyper-networks, we propose a rescaling fix that significantly improves convergence and enables training larger hyper-networks. We find that hyper-adapters are more parameter efficient than regular adapters, reaching the same performance with up to 12 times less parameters. When using the same number of parameters and FLOPS, our approach consistently outperforms regular adapters. Also, hyper-adapters converge faster than alternative approaches and scale better than regular dense networks. Our analysis shows that hyper-adapters learn to encode language relatedness, enabling positive transfer across languages.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/24/2022

Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

Massively multilingual models are promising for transfer learning across...
research
01/06/2016

Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism

We propose multi-way, multilingual neural machine translation. The propo...
research
09/30/2022

Language-Family Adapters for Multilingual Neural Machine Translation

Massively multilingual models pretrained on abundant corpora with self-s...
research
04/15/2021

Adaptive Sparse Transformer for Multilingual Translation

Multilingual machine translation has attracted much attention recently d...
research
03/05/2021

Hierarchical Transformer for Multilingual Machine Translation

The choice of parameter sharing strategy in multilingual machine transla...
research
05/23/2023

Condensing Multilingual Knowledge with Lightweight Language-Specific Modules

Incorporating language-specific (LS) modules is a proven method to boost...
research
04/30/2020

Bridging linguistic typology and multilingual machine translation with multi-view language representations

Sparse language vectors from linguistic typology databases and learned e...

Please sign up or login with your details

Forgot password? Click here to reset