Language modeling via stochastic processes

03/21/2022
by Rose E. Wang, et al.

Modern language models can generate high-quality short texts. However, they often meander or are incoherent when generating longer texts. These issues arise from the next-token-only language modeling objective. To address these issues, we introduce Time Control (TC), a language model that implicitly plans via a latent stochastic process. TC does this by learning a representation which maps the dynamics of how text changes in a document to the dynamics of a stochastic process of interest. Using this representation, the language model can generate text by first implicitly generating a document plan via a stochastic process, and then generating text that is consistent with this latent plan. Compared to domain-specific methods and fine-tuning GPT2 across a variety of text domains, TC improves performance on text infilling and discourse coherence. On long text generation settings, TC preserves the text structure both in terms of ordering (up to +40% better) and text length consistency (up to +17% better) compared to the baselines.
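The abstract does not name the stochastic process; the full paper instantiates it as a Brownian bridge between a start latent and an end latent, so intermediate latents interpolate between the two with pinned endpoints. As a rough illustration of what "implicitly generating a document plan via a stochastic process" can look like, here is a minimal NumPy sketch of sampling such a latent plan. This is not the authors' implementation: the function name, latent dimension, sigma, and the one-latent-per-sentence framing are illustrative assumptions.

```python
import numpy as np

def sample_brownian_bridge(z_start, z_end, num_steps, sigma=1.0, seed=None):
    """Sample a latent 'document plan' as a Brownian bridge pinned at
    z_start (first sentence) and z_end (last sentence).

    Sequential sampling: with r unit steps remaining, the next point is
        z' ~ N(z + (z_end - z) / r,  sigma^2 * (r - 1) / r),
    which lands exactly on z_end at the final step.
    """
    rng = np.random.default_rng(seed)
    z_start = np.asarray(z_start, dtype=float)
    z_end = np.asarray(z_end, dtype=float)
    path = [z_start]
    for t in range(1, num_steps):
        r = num_steps - t  # steps remaining, including this one
        prev = path[-1]
        mean = prev + (z_end - prev) / r
        std = sigma * np.sqrt((r - 1) / r)
        path.append(mean + std * rng.standard_normal(z_start.shape))
    return np.stack(path)  # shape: (num_steps, latent_dim)

# Hypothetical usage: plan a 10-sentence document in a 16-dim latent space.
plan = sample_brownian_bridge(np.zeros(16), np.ones(16),
                              num_steps=10, sigma=0.1, seed=0)
print(plan.shape)  # (10, 16)
```

In this sketch, each plan[t] would then condition the decoder (e.g., a fine-tuned GPT2) when generating the t-th sentence, so that local token choices stay consistent with the global latent trajectory.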


Related research

04/14/2023 · Stochastic Code Generation
Large language models pre-trained for code generation can generate high-...

10/12/2021 · DiscoDVT: Generating Long Text with Discourse-Aware Discrete Variational Transformer
Despite the recent advances in applying pre-trained language models to g...

06/01/2019 · Adversarial Generation and Encoding of Nested Texts
In this paper we propose a new language model called AGENT, which stands...

05/11/2020 · Enabling Language Models to Fill in the Blanks
We present a simple approach for text infilling, the task of predicting ...

04/08/2020 · Generating Narrative Text in a Switching Dynamical System
Early work on narrative modeling used explicit plans and goals to genera...

01/16/2013 · Probabilistic State-Dependent Grammars for Plan Recognition
Techniques for plan recognition under uncertainty require a stochastic m...

03/14/2023 · Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers
Long-sequence transformers are designed to improve the representation of...
