Contrastive Search Is What You Need For Neural Text Generation

10/25/2022
by Yixuan Su, et al.

Generating text with autoregressive language models (LMs) is of great importance to many natural language processing (NLP) applications. Previous solutions for this task often produce text that contains degenerative expressions or lacks semantic consistency. Recently, Su et al. introduced a new decoding method, contrastive search, based on the isotropic representation space of the language model, and obtained a new state of the art on various benchmarks. Additionally, Su et al. argued that the representations of autoregressive LMs (e.g., GPT-2) are intrinsically anisotropic, a view also shared by previous studies. Therefore, to ensure that the language model's representation space is isotropic, Su et al. proposed a contrastive learning scheme, SimCTG, which calibrates the language model's representations through additional training. In this study, we first answer the question: "Are autoregressive LMs really anisotropic?" To this end, we extensively evaluate the isotropy of LMs across 16 major languages. Surprisingly, we find that the anisotropy problem exists only in the two specific English GPT-2-small/medium models. In contrast, all other evaluated LMs are naturally isotropic, which contradicts the conclusion drawn by previous studies. Based on our findings, we further assess the contrastive search decoding method using off-the-shelf LMs on four generation tasks across 16 languages. Our experimental results demonstrate that contrastive search significantly outperforms previous decoding methods without any additional training. More notably, on 12 out of the 16 evaluated languages, contrastive search performs comparably to human-level performance as judged by human evaluations.
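The two ideas summarized in the abstract, checking whether an LM's token representations are isotropic and decoding with contrastive search from an off-the-shelf model, can be sketched with the Hugging Face transformers library. The snippet below is a minimal illustration, not the authors' released code; the model name, prompt text, and hyperparameter values (penalty_alpha, top_k) are placeholder assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; the paper evaluates off-the-shelf LMs in 16 languages

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def avg_pairwise_cosine(text: str) -> float:
    """Average cosine similarity between the last-layer hidden states of all
    distinct token pairs. Values near 0 suggest an isotropic representation
    space; values near 1 suggest anisotropy."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[-1][0]
    hidden = torch.nn.functional.normalize(hidden, dim=-1)
    sim = hidden @ hidden.T  # (seq_len, seq_len) cosine similarity matrix
    off_diag = sim.masked_select(~torch.eye(sim.size(0), dtype=torch.bool))
    return off_diag.mean().item()


print("avg pairwise cosine:", avg_pairwise_cosine("Contrastive search is a decoding method."))

# Contrastive search as exposed by transformers (>= 4.24): at each step the next
# token maximizes (1 - alpha) * model confidence - alpha * max cosine similarity
# to the representations of the tokens generated so far (the degeneration penalty).
prompt_ids = tokenizer("DeepMind Company is", return_tensors="pt").input_ids
output_ids = model.generate(prompt_ids, penalty_alpha=0.6, top_k=4, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The selection rule in the final step matches the contrastive search objective: candidates are drawn from the top-k of the model's distribution, and each candidate's probability is traded off against its maximum cosine similarity to the previous context representations, which is why the method benefits from an isotropic representation space.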

Related research

02/13/2022 · A Contrastive Framework for Neural Text Generation
Text generation is of great importance to many natural language processi...

01/11/2021 · Implicit Unlikelihood Training: Improving Neural Text Generation with Reinforcement Learning
Likelihood training and maximization-based decoding result in dull and r...

10/03/2022 · A Non-monotonic Self-terminating Language Model
Recent large-scale neural autoregressive sequence models have shown impr...

09/17/2023 · Contrastive Decoding Improves Reasoning in Large Language Models
We demonstrate that Contrastive Decoding – a simple, computationally lig...

11/19/2022 · An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
In the study, we empirically compare the two recently proposed decoding ...

07/04/2022 · BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model
Several recent studies have tested the use of transformer language model...

10/27/2022 · Contrastive Decoding: Open-ended Text Generation as Optimization
Likelihood, although useful as a training loss, is a poor search objecti...
