Truncation Sampling as Language Model Desmoothing

10/27/2022
by John Hewitt et al.

Long samples of text from neural language models can be of poor quality. Truncation sampling algorithms, such as top-p or top-k, address this by setting some words' probabilities to zero at each step. This work provides a framing for the aim of truncation, along with an improved algorithm for that aim. We propose thinking of a neural language model as a mixture of a true distribution and a smoothing distribution that avoids infinite perplexity. In this light, truncation algorithms aim to perform desmoothing: estimating a subset of the support of the true distribution. Finding a good subset is crucial: we show that top-p unnecessarily truncates high-probability words, for example causing it to truncate all words but Trump for a document that starts with Donald. We introduce η-sampling, which truncates words below an entropy-dependent probability threshold. Compared to previous algorithms, η-sampling generates more plausible long English documents according to humans, is better at breaking out of repetition, and behaves more reasonably on a battery of test distributions.
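To make the entropy-dependent threshold concrete, here is a minimal sketch of η-sampling over a next-word distribution. It assumes the threshold form η = min(ε, √ε · exp(−H(p))), where H(p) is the entropy of the distribution; the default ε = 0.0009 is illustrative, not necessarily the paper's recommended setting, and the function names are ours.

```python
import math
import random

def eta_threshold(probs, epsilon=0.0009):
    """Entropy-dependent truncation threshold (assumed form):
    eta = min(epsilon, sqrt(epsilon) * exp(-H(p)))."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))

def eta_sample(probs, epsilon=0.0009, rng=random):
    """Truncate words with probability below eta, renormalize, and sample.

    The kept set is never empty: since H(p) >= -log(max(probs)),
    exp(-H(p)) <= max(probs), and sqrt(epsilon) < 1, the most likely
    word always clears the threshold.
    """
    eta = eta_threshold(probs, epsilon)
    kept = [(i, p) for i, p in enumerate(probs) if p >= eta]
    total = sum(p for _, p in kept)
    r = rng.random() * total
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

On a very peaked distribution (low entropy), the threshold stays at ε and low-probability tail words are cut; on a flat distribution (high entropy), exp(−H) shrinks the threshold, so more words survive, matching the intuition that truncation should be gentler when the model is genuinely uncertain.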


