Darmok and Jalad at Tanagra: A Dataset and Model for English-to-Tamarian Translation

07/16/2021
by Peter Jansen, et al.

Tamarian, a fictional language introduced in the Star Trek episode Darmok, communicates meaning through utterances of metaphorical references, such as "Darmok and Jalad at Tanagra" instead of "We should work together." This work assembles a Tamarian-English dictionary of utterances from the original episode and several follow-on novels, and uses it to construct a parallel corpus of 456 English-Tamarian utterances. A machine translation system based on a large language model (T5) is trained on this parallel corpus and is shown to reach an accuracy of 76% when translating from English to Tamarian on known utterances.
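As a rough illustration of how a dictionary of utterances might be expanded into a parallel corpus, here is a minimal Python sketch. It is not the authors' code. The dictionary layout, the task prefix, the second English paraphrase, and the exact-match metric are all assumptions made for this example; only the pairing of "Darmok and Jalad at Tanagra" with "We should work together." comes from the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionary layout: each Tamarian utterance maps to one or
# more English paraphrases. Only the first paraphrase is from the abstract;
# the second is an invented placeholder for illustration.
TAMARIAN_DICTIONARY: Dict[str, List[str]] = {
    "Darmok and Jalad at Tanagra": [
        "We should work together.",
        "Let us cooperate.",  # illustrative, not from the source
    ],
}

# Assumed text-to-text task prefix in the style commonly used with T5.
TASK_PREFIX = "translate English to Tamarian: "


def build_parallel_corpus(
    dictionary: Dict[str, List[str]], prefix: str = TASK_PREFIX
) -> List[Tuple[str, str]]:
    """Expand the dictionary into (source, target) training pairs."""
    pairs: List[Tuple[str, str]] = []
    for tamarian, english_paraphrases in dictionary.items():
        for english in english_paraphrases:
            pairs.append((prefix + english.strip(), tamarian))
    return pairs


def exact_match_accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions identical to their reference (assumed metric)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


corpus = build_parallel_corpus(TAMARIAN_DICTIONARY)
for source, target in corpus:
    print(source, "->", target)
```

Each resulting pair could then be passed to a sequence-to-sequence fine-tuning routine. Several English paraphrases map to the same Tamarian target, so the model has to learn a many-to-one mapping.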


research
01/07/2018

MIZAN: A Large Persian-English Parallel Corpus

One of the most major and essential tasks in natural language processing...
research
07/07/2020

scb-mt-en-th-2020: A Large English-Thai Parallel Corpus

The primary objective of our work is to build a large-scale English-Thai...
research
01/24/2019

Automatic Parallel Corpus Creation for Hindi-English News Translation Task

The parallel corpus for multilingual NLP tasks, deep learning applicatio...
research
06/06/2021

Itihasa: A large-scale corpus for Sanskrit to English translation

This work introduces Itihasa, a large-scale translation dataset containi...
research
03/31/2019

Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues

The recent surge of text-based online counseling applications enables us...
research
10/22/2020

Summarizing Utterances from Japanese Assembly Minutes using Political Sentence-BERT-based Method for QA Lab-PoliInfo-2 Task of NTCIR-15

There are many discussions held during political meetings, and a large n...
research
04/18/2018

Learning to Map Context-Dependent Sentences to Executable Formal Queries

We propose a context-dependent model to map utterances within an interac...
