Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation

05/30/2022
by Verna Dankers, et al.

Unlike literal expressions, idioms' meanings do not follow directly from their parts, posing a challenge for neural machine translation (NMT). NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations. In this work, we investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, the Transformer, by analysing hidden states and attention patterns for models with English as the source language and one of seven European languages as the target language. When the Transformer emits a non-literal translation - i.e. identifies the expression as idiomatic - the encoder processes the idiom more strongly as a single lexical unit than it does a literal expression. This manifests in the idiom's parts being grouped through attention and in reduced interaction between the idiom and its context. In the decoder's cross-attention, figurative inputs result in reduced attention on source-side tokens. These results suggest that the Transformer's tendency to process idioms as compositional expressions contributes to literal translations of idioms.
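The claim that the encoder groups an idiom's parts "through attention" can be probed with a simple diagnostic: measure how much encoder self-attention mass the idiom's tokens direct at other tokens of the same idiom versus the surrounding context. The snippet below is a minimal sketch of such a probe, not the paper's exact metric: it assumes an off-the-shelf Marian English-German checkpoint (Helsinki-NLP/opus-mt-en-de) from Hugging Face transformers, and the example sentence, idiom, and span-matching heuristic are illustrative choices.

```python
# Illustrative sketch: within-idiom attention share in the encoder of an
# off-the-shelf NMT Transformer (assumed checkpoint, not the paper's models).
import torch
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # assumed en-de checkpoint for illustration
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name).eval()

sentence = "He finally kicked the bucket after a long illness."
idiom = "kicked the bucket"

enc = tokenizer(sentence, return_tensors="pt")
ids = enc["input_ids"][0].tolist()

# Locate the idiom's subword span by matching its token ids inside the sentence.
# This heuristic can fail if SentencePiece segments the idiom differently in
# context; inspect tokenizer.convert_ids_to_tokens(ids) in that case.
idiom_ids = tokenizer(idiom, add_special_tokens=False)["input_ids"]
start = next(i for i in range(len(ids) - len(idiom_ids) + 1)
             if ids[i:i + len(idiom_ids)] == idiom_ids)
idiom_pos = list(range(start, start + len(idiom_ids)))

with torch.no_grad():
    out = model.get_encoder()(**enc, output_attentions=True)

# out.attentions: one (batch, heads, query, key) tensor per encoder layer.
for layer, att in enumerate(out.attentions):
    att = att[0].mean(dim=0)                 # average over heads -> (query, key)
    rows = att[idiom_pos]                    # attention distributions of idiom tokens
    within = rows[:, idiom_pos].sum(dim=-1)  # mass kept inside the idiom span
    print(f"layer {layer}: within-idiom attention = {within.mean().item():.3f}")
```

Comparing this share for the idiomatic sentence against a literal control (e.g. "kicked the ball") gives a rough, per-layer view of the grouping effect the abstract describes; the paper's own analysis uses more controlled data and metrics.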

