Empowering LLM-based Machine Translation with Cultural Awareness

05/23/2023
by Binwei Yao, et al.

Traditional neural machine translation (NMT) systems often fail to translate sentences that contain culturally specific information. Most previous NMT methods incorporate external cultural knowledge during training, which requires fine-tuning on low-frequency items specific to the culture. Recent in-context learning uses lightweight prompts to guide large language models (LLMs) to perform machine translation; however, whether this approach can inject cultural awareness into machine translation remains unclear. To this end, we introduce a new data curation pipeline to construct a culturally relevant parallel corpus, enriched with annotations of culture-specific entities. Additionally, we design simple but effective prompting strategies to assist LLM-based translation. Extensive experiments show that our approaches substantially help incorporate cultural knowledge into LLM-based machine translation, outperforming traditional NMT systems in translating culture-specific sentences.

research
09/20/2023

Towards Effective Disambiguation for Machine Translation with Large Language Models

Resolving semantic ambiguity has long been recognised as a central chall...
research
03/31/2023

ℰ KÚ [MASK]: Integrating Yorùbá cultural greetings into machine translation

This paper investigates the performance of massively multilingual neural...
research
08/07/2017

Memory-augmented Neural Machine Translation

Neural machine translation (NMT) has achieved notable success in recent ...
research
10/17/2020

A Corpus for English-Japanese Multimodal Neural Machine Translation with Comparable Sentences

Multimodal neural machine translation (NMT) has become an increasingly i...
research
11/25/2022

Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality?

Neural machine translation (NMT) is often criticized for failures that h...
research
09/07/2017

Translating Domain-Specific Expressions in Knowledge Bases with Neural Machine Translation

Our work presented in this paper focuses on the translation of domain-sp...
research
06/02/2023

Assessing the Importance of Frequency versus Compositionality for Subword-based Tokenization in NMT

Subword tokenization is the de facto standard for tokenization in neural...
