Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation

08/30/2018
by Antonio Toral, et al.

We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English. We use pairwise ranking and consider three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context. If we consider only original source text (i.e. text that was originally written in the source language rather than translated from another language, so-called translationese), then we find evidence showing that human parity has not been achieved. We compare the judgments of professional translators against those of non-experts and discover that those of the experts result in higher inter-annotator agreement and better discrimination between human and machine translations. In addition, we analyse the human translations of the test set and identify important translation issues. Finally, based on these findings, we provide a set of recommendations for future human evaluations of MT.


