Reinforced Curriculum Learning on Pre-trained Neural Machine Translation Models

04/13/2020
by Mingjun Zhao, et al.

The competitive performance of neural machine translation (NMT) critically relies on large amounts of training data. However, acquiring high-quality translation pairs requires expert knowledge and is costly. Therefore, how to best utilize a given dataset of samples with diverse quality and characteristics becomes an important yet understudied question in NMT. Curriculum learning methods have been introduced to NMT to optimize a model's performance by prescribing the data input order, based on heuristics such as the assessment of noise and difficulty levels. However, existing methods require training from scratch, while in practice most NMT models are already pre-trained on big data. Moreover, as heuristics, they do not generalize well. In this paper, we aim to learn a curriculum for improving a pre-trained NMT model by re-selecting influential data samples from the original training set, and we formulate this task as a reinforcement learning problem. Specifically, we propose a data selection framework based on Deterministic Actor-Critic, in which a critic network predicts the expected change in model performance caused by training on a given sample, while an actor network learns to select the best sample out of a random batch of samples presented to it. Experiments on several translation datasets show that our method can further improve the performance of NMT when original batch training reaches its ceiling, without using additional new training data, and significantly outperforms several strong baseline methods.
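To make the selection framework concrete, below is a minimal PyTorch sketch of the actor-critic data-selection loop described in the abstract. It assumes each training sample has already been summarized as a fixed-length feature vector (for example, sentence length, the NMT model's per-sample loss, or a domain score); the names `Actor`, `Critic`, and `selection_step`, the network sizes, and the squared-error critic loss are illustrative choices, not details taken from the paper. The actor here is trained by weighting detached critic values with a softmax over its scores, which is a simplification of the deterministic actor-critic objective rather than the paper's exact update.

```python
import torch
import torch.nn as nn


class Critic(nn.Module):
    """Predicts the expected change in NMT dev-set performance if the
    pre-trained model were fine-tuned on a given candidate sample."""

    def __init__(self, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (k, feat_dim) -> predicted performance change: (k, 1)
        return self.net(feats)


class Actor(nn.Module):
    """Scores each candidate in a randomly drawn batch; the highest-scoring
    sample is the one re-selected for the next fine-tuning update."""

    def __init__(self, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (k, feat_dim) -> selection logits: (k,)
        return self.net(feats).squeeze(-1)


def selection_step(actor, critic, actor_opt, critic_opt,
                   cand_feats, prev_idx=None, observed_reward=None):
    """One curriculum step over a random batch of k candidate samples.

    `observed_reward` is the measured change in dev-set performance after the
    NMT model was updated on the previously selected sample (index `prev_idx`);
    it supervises the critic, while the actor is pushed toward candidates the
    critic currently values most.
    """
    # Critic update: regress its prediction toward the observed reward.
    if prev_idx is not None and observed_reward is not None:
        pred = critic(cand_feats[prev_idx].unsqueeze(0)).squeeze()
        critic_loss = (pred - observed_reward) ** 2
        critic_opt.zero_grad()
        critic_loss.backward()
        critic_opt.step()

    # Actor update: concentrate selection probability on high-value samples.
    with torch.no_grad():
        values = critic(cand_feats).squeeze(-1)      # (k,)
    probs = torch.softmax(actor(cand_feats), dim=0)  # (k,)
    actor_loss = -(probs * values).sum()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Hard selection: the index of the sample to feed to the NMT model next.
    return int(torch.argmax(probs))


# Illustrative usage with random features standing in for real sample features.
feat_dim, k = 8, 16
actor, critic = Actor(feat_dim), Critic(feat_dim)
a_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
chosen = selection_step(actor, critic, a_opt, c_opt, torch.randn(k, feat_dim))
```

In a full pipeline, the index returned by `selection_step` would be used to fine-tune the pre-trained NMT model on the chosen sample, and the resulting change in validation performance (e.g., BLEU) would be passed back as `observed_reward` on the next call.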


