Scheduled Sampling Based on Decoding Steps for Neural Machine Translation

08/30/2021
by Yijin Liu, et al.

Scheduled sampling is widely used to mitigate the exposure bias problem in neural machine translation. Its core idea is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference. However, vanilla scheduled sampling is based solely on training steps and treats all decoding steps equally. In other words, it simulates an inference scene with uniform error rates, which contradicts real inference, where later decoding steps usually have higher error rates due to error accumulation. To alleviate this discrepancy, we propose scheduled sampling methods based on decoding steps, which increase the chance of selecting predicted tokens as the decoding step grows. Consequently, we can simulate the inference scene more realistically during training and thus better bridge the gap between training and inference. Moreover, we investigate scheduled sampling based on both training steps and decoding steps for further improvements. Experimentally, our approaches significantly outperform the Transformer baseline and vanilla scheduled sampling on three large-scale WMT tasks. Our approaches also generalize well to the text summarization task on two popular benchmarks.
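As a concrete illustration of decoding-step-based scheduling, the sketch below shows one plausible schedule in Python: the probability of feeding the model's own prediction grows with the decoding step and is optionally combined with the usual training-step ramp of vanilla scheduled sampling. The exponential form, the hyper-parameters, and the function names are illustrative assumptions, not the paper's exact formulation.

    import random

    def pred_token_prob(dec_step, max_prob=0.9, k=0.99):
        # Probability of feeding the model's own prediction at decoding step dec_step.
        # It grows with the decoding step, so later positions see more predicted tokens,
        # mimicking error accumulation at inference time (illustrative exponential schedule).
        return max_prob * (1.0 - k ** dec_step)

    def joint_prob(dec_step, train_step, warmup=10000, max_prob=0.9, k=0.99):
        # Combine the training-step ramp of vanilla scheduled sampling
        # with the decoding-step schedule above (hypothetical combination).
        train_factor = min(1.0, train_step / warmup)
        return train_factor * pred_token_prob(dec_step, max_prob, k)

    def choose_input(gold_prev, pred_prev, dec_step, train_step):
        # Decide which previous token to feed at this decoding step during training.
        p = joint_prob(dec_step, train_step)
        return pred_prev if random.random() < p else gold_prev

Under such a schedule, later decoding steps at a fixed training step receive predicted rather than ground-truth tokens with higher probability, which more closely matches the error distribution seen at inference.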


