Diverse Pretrained Context Encodings Improve Document Translation

06/07/2021
by Domenic Donato, et al.

We propose a new architecture for adapting a sentence-level sequence-to-sequence transformer by incorporating multiple pretrained document context signals, and we assess the impact on translation performance of (1) different pretraining approaches for generating these signals, (2) the quantity of parallel data for which document context is available, and (3) conditioning on source, target, or source and target contexts. Experiments on the NIST Chinese-English task and the IWSLT and WMT English-German tasks support four general conclusions: that using pretrained context representations markedly improves sample efficiency, that adequate parallel data resources are crucial for learning to use document context, that jointly conditioning on multiple context representations outperforms any single representation, and that source context is more valuable for translation performance than target-side context. Our best multi-context model consistently outperforms the best existing context-aware transformers.

