Exploring Linguistic Similarity and Zero-Shot Learning for Multilingual Translation of Dravidian Languages

08/10/2023
by Danish Ebadulla, et al.

Current research in zero-shot translation is plagued by several issues, such as high compute requirements, increased training time, and off-target translations. Proposed remedies often come at the cost of additional data or compute. Pivot-based neural machine translation is preferred over a single-encoder model in most settings, despite its increased training and evaluation time. In this work, we overcome the shortcomings of zero-shot translation by taking advantage of transliteration and linguistic similarity. We build a single encoder-decoder neural machine translation system for Dravidian-Dravidian multilingual translation and perform zero-shot translation. We examine the tradeoff between training data and zero-shot accuracy, and we evaluate the performance of our vanilla method against the current state-of-the-art pivot-based method. We also test the theory that morphologically rich languages require large vocabularies by restricting the vocabulary using an optimal-transport-based technique. Our model achieves scores within 3 BLEU of large-scale pivot-based models when it is trained on 50% of the language directions.
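
As a rough illustration of what training on 50% of the language directions can mean, the sketch below enumerates every ordered source-target pair for a set of languages and splits the pairs into a supervised half and a held-out zero-shot half. The four language codes, the function name split_directions, and the random split are assumptions made for illustration only; the abstract does not say which languages or which directions the authors used.

```python
from itertools import permutations
import random

# Assumed language set (Kannada, Malayalam, Tamil, Telugu); not stated in the abstract.
LANGS = ["kn", "ml", "ta", "te"]


def split_directions(langs, trained_fraction=0.5, seed=0):
    """Split all ordered (source, target) pairs into trained and zero-shot sets.

    Hypothetical helper: the split here is random, which may differ from
    how the paper chooses its supervised directions.
    """
    directions = list(permutations(langs, 2))  # n * (n - 1) ordered pairs
    rng = random.Random(seed)                  # seeded for reproducibility
    rng.shuffle(directions)
    cut = round(len(directions) * trained_fraction)
    trained = sorted(directions[:cut])
    zero_shot = sorted(directions[cut:])
    return trained, zero_shot


trained, zero_shot = split_directions(LANGS)
print(f"{len(trained)} trained directions, {len(zero_shot)} zero-shot directions")
```

With four languages there are 12 directions, so a 50% split leaves 6 for supervised training and 6 to be evaluated zero-shot. A purely random split may leave some language unseen as a source or as a target, so a practical setup would likely check coverage before training.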

research
04/04/2019

Consistency by Agreement in Zero-shot Neural Machine Translation

Generalization and reliability of multilingual translation often highly ...
research
11/04/2018

Improving Zero-Shot Translation of Low-Resource Languages

Recent work on multilingual neural machine translation reported competit...
research
10/12/2020

Controllable Paraphrasing and Translation with a Syntactic Exemplar

Most prior work on exemplar-based syntactically controlled paraphrase ge...
research
05/20/2022

Understanding and Mitigating the Uncertainty in Zero-Shot Translation

Zero-shot translation is a promising direction for building a comprehens...
research
05/26/2023

RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation

Attribute-controlled translation (ACT) is a subtask of machine translati...
research
06/11/2021

Zero-Shot Controlled Generation with Encoder-Decoder Transformers

Controlling neural network-based models for natural language generation ...
research
07/28/2023

Multilingual Lexical Simplification via Paraphrase Generation

Lexical simplification (LS) methods based on pretrained language models ...
