Can Unconditional Language Models Recover Arbitrary Sentences?

07/10/2019
by Nishant Subramani, et al.

Neural network-based generative language models like ELMo and BERT can work effectively as general purpose sentence encoders in text classification without further fine-tuning. Is it possible to adapt them in a similar way for use as general-purpose decoders? For this to be possible, it would need to be the case that for any target sentence of interest, there is some continuous representation that can be passed to the language model to cause it to reproduce that sentence. We set aside the difficult problem of designing an encoder that can produce such representations and instead ask directly whether such representations exist at all. To do this, we introduce a pair of effective complementary methods for feeding representations into pretrained unconditional language models and a corresponding set of methods to map sentences into and out of this representation space, the reparametrized sentence space. We then investigate the conditions under which a language model can be made to generate a sentence through the identification of a point in such a space and find that it is possible to recover arbitrary sentences nearly perfectly with language models and representations of moderate size.
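
To make the idea concrete, the sketch below is a minimal PyTorch illustration of the general setup, not the paper's actual injection methods: a small stand-in language model is frozen, a continuous vector z is added to its hidden state at every timestep, and z is optimized by gradient descent so that the frozen model assigns high likelihood to a target sentence. The TinyLM class, the recover_sentence_vector function, and all hyperparameters here are hypothetical placeholders.

```python
# Illustrative sketch only: search for a continuous vector z that, when added to a
# frozen language model's hidden state at every step, makes the model reproduce a
# given target sentence. (The paper's injection and reparametrization methods differ.)
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A stand-in for a pretrained unconditional LM (embedding -> GRU -> softmax)."""
    def __init__(self, vocab_size=1000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, tokens, z=None):
        h, _ = self.rnn(self.embed(tokens))        # (1, T, dim)
        if z is not None:                          # inject the sentence representation
            h = h + z                              # broadcast over all timesteps
        return self.out(h)                         # next-token logits

def recover_sentence_vector(lm, target_ids, dim=256, steps=500, lr=0.1):
    """Optimize z so the frozen LM assigns high likelihood to `target_ids`."""
    for p in lm.parameters():
        p.requires_grad_(False)                    # keep the LM itself unconditional/frozen
    z = torch.zeros(1, 1, dim, requires_grad=True) # the point in representation space
    opt = torch.optim.Adam([z], lr=lr)
    inputs, targets = target_ids[:, :-1], target_ids[:, 1:]
    for _ in range(steps):
        logits = lm(inputs, z)
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()

lm = TinyLM()
target_ids = torch.randint(0, 1000, (1, 12))       # placeholder token ids for a target sentence
z = recover_sentence_vector(lm, target_ids)
```

In this sketch, a sentence would count as recovered if greedy decoding from the frozen model under the optimized z reproduces the target token sequence exactly.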

Related research

Discovering Useful Sentence Representations from Large Pretrained Language Models (08/20/2020)
Despite the extensive success of pretrained language models as encoders ...

Extracting Latent Steering Vectors from Pretrained Language Models (05/10/2022)
Prior work on controllable text generation has focused on learning how t...

Compressing Sentence Representation with Maximum Coding Rate Reduction (04/25/2023)
In most natural language inference problems, sentence representation is ...

Pretrained Language Models for Sequential Sentence Classification (09/09/2019)
As a step toward better document-level understanding, we explore classif...

Is Your Language Model Ready for Dense Representation Fine-tuning? (04/16/2021)
Pre-trained language models (LM) have become go-to text representation e...

Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models (11/10/2014)
Inspired by recent advances in multimodal learning and machine translati...

Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence (05/04/2023)
Sentence-level representations are beneficial for various natural langua...
