Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks

10/24/2022
by Vikas Raunak, et al.

Memorization presents a challenge for several constrained Natural Language Generation (NLG) tasks such as Neural Machine Translation (NMT), wherein the proclivity of neural models to memorize noisy and atypical samples interacts adversely with noisy (web-crawled) datasets. However, previous studies of memorization in constrained NLG tasks have focused only on counterfactual memorization, linking it to the problem of hallucinations. In this work, we propose a new, inexpensive algorithm for detecting extractive memorization (exact training data generation under insufficient context) in constrained sequence generation tasks and use it to study extractive memorization and its effects in NMT. We demonstrate that extractive memorization poses a serious threat to NMT reliability by qualitatively and quantitatively characterizing the memorized samples as well as the model behavior in their vicinity. Based on empirical observations, we develop a simple algorithm that elicits non-memorized translations of memorized samples from the same model, for a large fraction of such samples. Finally, we show that the proposed algorithm could also be leveraged to mitigate memorization in the model through finetuning. We have released the code to reproduce our results at https://github.com/vyraun/Finding-Memo.
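
To make the definition of extractive memorization concrete, the following is a minimal sketch, not the released implementation. It flags a training pair as memorized when the model reproduces the full training target from only a prefix of the source. The function name `find_extractively_memorized`, the `translate` callable, whitespace tokenization, and the `max_prefix_ratio` threshold are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Callable, Iterable, List, Tuple


def find_extractively_memorized(
    pairs: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],
    max_prefix_ratio: float = 0.75,  # assumed threshold for "insufficient context"
) -> List[Tuple[str, str, str]]:
    """Return (source, target, prefix) for pairs whose full training target
    is reproduced exactly from a truncated source prefix."""
    memorized = []
    for source, target in pairs:
        tokens = source.split()  # assumption: whitespace tokenization
        # Only prefixes up to this many tokens count as insufficient context.
        limit = int(len(tokens) * max_prefix_ratio)
        for n in range(1, limit + 1):
            prefix = " ".join(tokens[:n])
            # Exact match against the training target signals memorization.
            if translate(prefix).strip() == target.strip():
                memorized.append((source, target, prefix))
                break  # keep the shortest triggering prefix
    return memorized


def toy_translate(text: str) -> str:
    """Hypothetical stand-in for an NMT model, used only for this demo."""
    if text.startswith("a b"):
        return "W X Y Z"  # behaves as if one training target were memorized
    return text.upper()


toy_pairs = [("a b c d", "W X Y Z"), ("e f g h", "E F G H")]
flagged = find_extractively_memorized(toy_pairs, toy_translate)
```

In this toy setup only the first pair is flagged, because its target is produced from the two-token prefix `a b`. With a real model, `translate` would wrap the system's decoding call, and the cost is at most one decode per candidate prefix.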


