DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages

05/24/2022
by Gabriele Sarti, et al.

We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keystrokes, editing times, pauses, and perceived effort were recorded, enabling an in-depth, cross-lingual evaluation of NMT quality and its post-editing process. Using this new dataset, we assess the impact on translation productivity of two state-of-the-art NMT systems: Google Translate and the open-source multilingual model mBART50. We find that, while post-editing is consistently faster than translation from scratch, the magnitude of its contribution varies widely across systems and languages, ranging from doubled productivity in Dutch and Italian to marginal gains in Arabic, Turkish, and Ukrainian for some of the evaluated modalities. Moreover, the observed cross-language variability appears to partly reflect source-target relatedness and type of target morphology, while remaining hard to predict even with state-of-the-art automatic MT quality metrics. We publicly release the complete dataset, including all collected behavioural data, to foster new research on the ability of state-of-the-art NMT systems to generate text in typologically diverse languages.


research
05/24/2023

Leveraging GPT-4 for Automatic Translation Post-Editing

While Neural Machine Translation (NMT) represents the leading approach t...
research
06/04/2019

Post-editing Productivity with Neural Machine Translation: An Empirical Assessment of Speed and Quality in the Banking and Finance Domain

Neural machine translation (NMT) has set new quality standards in automa...
research
09/10/2021

Neural Machine Translation Quality and Post-Editing Performance

We test the natural expectation that using MT in professional translatio...
research
12/13/2017

A User-Study on Online Adaptation of Neural Machine Translation to Human Post-Edits

The advantages of neural machine translation (NMT) have been extensively...
research
09/07/2016

Feasibility of Post-Editing Speech Transcriptions with a Mismatched Crowd

Manual correction of speech transcription can involve a selection from p...
research
06/15/2017

Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation

This work presents a novel approach to Automatic Post-Editing (APE) and ...
research
05/31/2021

Verdi: Quality Estimation and Error Detection for Bilingual

Translation Quality Estimation is critical to reducing post-editing effo...
