How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus

10/30/2021
by Chanjun Park, et al.

This paper proposes a tool for efficiently constructing high-quality parallel corpora while minimizing human labor, and makes the tool publicly available. The proposed construction process is based on neural machine translation (NMT), allowing it not only to coexist with human translation but also to improve the efficiency of human translation by combining data quality control with human translation in a data-centric approach.

research
01/07/2023

Building a Parallel Corpus and Training Translation Models Between Luganda and English

Neural machine translation (NMT) has achieved great successes with large...
research
04/17/2018

Investigating Backtranslation in Neural Machine Translation

A prerequisite for training corpus-based machine translation (MT) system...
research
11/01/2021

A New Tool for Efficiently Generating Quality Estimation Datasets

Building of data for quality estimation (QE) training is expensive and r...
research
04/05/2020

AR: Auto-Repair the Synthetic Data for Neural Machine Translation

Compared with only using limited authentic parallel data as training cor...
research
10/28/2021

Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC

Machine translation (MT) system aims to translate source language into t...
research
03/10/2021

Majority Voting with Bidirectional Pre-translation For Bitext Retrieval

Obtaining high-quality parallel corpora is of paramount importance for t...
research
10/05/2020

On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation

Commit messages play an important role in software maintenance and evolu...
