Improving Character-level Japanese-Chinese Neural Machine Translation with Radicals as an Additional Input Feature

05/08/2018
by Jinyi Zhang, et al.

In recent years, Neural Machine Translation (NMT) has achieved impressive results. While additional linguistic features of input words have been shown to improve word-level NMT, no additional character features have so far been used to improve character-level NMT. In this paper, we show that the radicals of Chinese characters (or kanji), used as character-level feature information, can easily provide further improvements in character-level NMT. In experiments on the WAT2016 Japanese-Chinese scientific paper excerpt corpus (ASPEC-JP), we find that the proposed method improves translation quality on two measures: perplexity and BLEU. The character-level NMT model with the radical input feature achieved a state-of-the-art result of 40.61 BLEU points on the test set, an improvement of about 8.6 BLEU points over the best system on the WAT2016 Japanese-to-Chinese translation subtask with ASPEC-JP. The improvements over character-level NMT with no additional input feature are up to about 1.5 and 1.4 BLEU points on the development-test set and the test set of the corpus, respectively.
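The abstract does not say how the radical feature is fed to the encoder. As a rough illustration only, the sketch below assumes a common scheme for additional input features: each character is paired with its radical, and the two embeddings are concatenated per position. The radical table, the `<none>` placeholder, the embedding sizes, and all function names are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical, hand-written radical table for illustration only;
# a real system would need a full kanji/hanzi radical dictionary.
RADICALS = {
    "海": "氵",
    "河": "氵",
    "松": "木",
    "林": "木",
}
NO_RADICAL = "<none>"  # assumed placeholder for kana, punctuation, unlisted characters


def annotate(sentence):
    """Pair every character with its radical feature."""
    return [(ch, RADICALS.get(ch, NO_RADICAL)) for ch in sentence]


class Embedding:
    """Tiny lookup table of randomly initialised vectors (stand-in for a trained layer)."""

    def __init__(self, dim, seed):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def __call__(self, symbol):
        # Create the vector on first use, then reuse it for the same symbol.
        if symbol not in self.table:
            self.table[symbol] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return self.table[symbol]


def embed(sentence, char_emb, rad_emb):
    """Concatenate the character and radical embeddings at each position."""
    return [char_emb(ch) + rad_emb(rad) for ch, rad in annotate(sentence)]


char_emb = Embedding(dim=6, seed=0)  # sizes are arbitrary choices for the sketch
rad_emb = Embedding(dim=2, seed=1)
vectors = embed("松林の海", char_emb, rad_emb)
```

Under this assumption, characters that share a radical (here 松 and 林) share the radical part of their input vector, which is one plausible way the feature could help the model relate rare characters to more frequent ones.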

