Incorporating Global Visual Features into Attention-Based Neural Machine Translation

01/23/2017
by   Iacer Calixto, et al.
0

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder. We utilise global image features extracted using a pre-trained convolutional neural network and incorporate them (i) as words in the source sentence, (ii) to initialise the encoder hidden state, and (iii) as additional data to initialise the decoder hidden state. In our experiments, we evaluate how these different strategies to incorporate global image features compare and which ones perform best. We also study the impact that adding synthetic multi-modal, multilingual data brings and find that the additional data have a positive impact on multi-modal models. We report new state-of-the-art results and our best models also significantly improve on a comparable phrase-based Statistical MT (PBSMT) model trained on the Multi30k data set according to all metrics evaluated. To the best of our knowledge, it is the first time a purely neural model significantly improves over a PBSMT model on all metrics evaluated on this data set.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/04/2017

Doubly-Attentive Decoder for Multi-modal Neural Machine Translation

We introduce a Multi-modal Neural Machine Translation model in which a d...
research
07/17/2020

A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation

Multi-modal neural machine translation (NMT) aims to translate source se...
research
03/16/2021

Gumbel-Attention for Multi-modal Machine Translation

Multi-modal machine translation (MMT) improves translation quality by in...
research
03/04/2021

An empirical analysis of phrase-based and neural machine translation

Two popular types of machine translation (MT) are phrase-based and neura...
research
08/31/2018

The MeMAD Submission to the WMT18 Multimodal Translation Task

This paper describes the MeMAD project entry to the WMT Multimodal Machi...
research
09/06/2017

Information-Propogation-Enhanced Neural Machine Translation by Relation Model

Even though sequence-to-sequence neural machine translation (NMT) model ...
research
02/03/2017

Multilingual Multi-modal Embeddings for Natural Language Processing

We propose a novel discriminative model that learns embeddings from mult...

Please sign up or login with your details

Forgot password? Click here to reset