Recurrent Stacking of Layers for Compact Neural Machine Translation Models

07/14/2018
by Raj Dabre, et al.

In Neural Machine Translation (NMT), the most common practice is to stack a number of recurrent or feed-forward layers in the encoder and the decoder. Each additional layer tends to improve translation quality significantly, but it also substantially increases the number of parameters. In this paper, we propose to share parameters across all stacked layers, leading to a recurrently stacked NMT model. We empirically show that a model that recurrently stacks a single layer 6 times achieves translation quality comparable to a model that stacks 6 separate layers. We also show that using back-translated parallel corpora as additional training data yields further significant improvements in translation quality.
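To make the idea concrete, below is a minimal PyTorch sketch of recurrent layer stacking. The class name, the choice of nn.TransformerEncoderLayer, and the hyperparameters are illustrative assumptions, not the paper's exact setup; the point is only that one layer's parameters are reused at every depth.

```python
import torch
import torch.nn as nn

class RecurrentlyStackedEncoder(nn.Module):
    """Sketch of recurrent stacking: a single layer's parameters are
    reused for every application, so effective depth grows while the
    parameter count stays that of one layer. (Illustrative, not the
    authors' exact implementation.)"""

    def __init__(self, d_model: int = 512, nhead: int = 8,
                 num_applications: int = 6):
        super().__init__()
        # The only learnable parameters live in this single layer.
        self.shared_layer = nn.TransformerEncoderLayer(d_model, nhead)
        self.num_applications = num_applications

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply the same layer (same weights) repeatedly, in place of
        # six independently parameterized layers.
        for _ in range(self.num_applications):
            x = self.shared_layer(x)
        return x

# Usage: input shaped (seq_len, batch, d_model), the default layout
# expected by nn.TransformerEncoderLayer.
encoder = RecurrentlyStackedEncoder()
out = encoder(torch.randn(10, 2, 512))
```

A conventional 6-layer encoder would hold six independent copies of these layer parameters; the shared version above keeps one copy, which is the source of the compactness the paper reports.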


Related research

06/18/2021
Recurrent Stacking of Layers in Neural Networks: An Application to Neural Machine Translation
In deep neural network modeling, the most common practice is to stack a ...

10/06/2020
On the Sparsity of Neural Machine Translation Models
Modern neural machine translation (NMT) models employ a large number of ...

10/22/2020
Not all parameters are born equal: Attention is mostly what you need
Transformers are widely used in state-of-the-art machine translation, bu...

04/11/2017
Unfolding and Shrinking Neural Machine Translation Ensembles
Ensembling is a well-known technique in neural machine translation (NMT)...

01/23/2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
The capacity of a neural network to absorb information is limited by its...

02/15/2021
Meta Back-translation
Back-translation is an effective strategy to improve the performance of ...

09/27/2019
In-training Matrix Factorization for Parameter-frugal Neural Machine Translation
In this paper, we propose the use of in-training matrix factorization to...
