Extrapolating Multilingual Understanding Models as Multilingual Generators

05/22/2023
by Bohong Wu, et al.

Multilingual understanding models (i.e., encoder-based models such as mBERT), pre-trained via masked language modeling, have achieved promising results on many language understanding tasks. However, these non-autoregressive (NAR) models still struggle to generate high-quality text compared with autoregressive (AR) models. Considering that encoder-based models have the advantages of efficient generation and self-correction, this paper explores methods to endow multilingual understanding models with generation abilities in order to obtain a unified model. Specifically, we start from a multilingual encoder (XLM-R) and propose a Semantic-Guided Alignment-then-Denoising (SGA) approach that adapts an encoder into a multilingual generator with a small number of new parameters. Experiments show that the proposed approach is an effective adaptation method, outperforming widely used initialization-based methods with gains of 9.4 BLEU on machine translation, 8.1 ROUGE-L on question generation, and 5.5 METEOR on story generation on XLM-R_large. On the other hand, we observe that XLM-R is still inferior to mBART in supervised settings despite better results in zero-shot settings, indicating that more exploration is required to make understanding models strong generators.
