GENIE: Large Scale Pre-training for Text Generation with Diffusion Model

by Zhenghao Lin, et al.

In this paper, we propose GENIE, a large-scale pre-trained language model for text GENeration with a dIffusion modEl. GENIE is a pre-trained sequence-to-sequence text generation model that combines a Transformer encoder with a diffusion-based decoder. The diffusion model receives latent information from the encoder, which guides the denoising at the current time step. After many such denoising iterations, the diffusion model transforms Gaussian noise into diverse output text controlled by the input text. This architecture also allows us to apply large-scale pre-training to GENIE. We propose a novel pre-training task named continuous paragraph denoise, designed around the characteristics of the diffusion model. Extensive experiments on the XSum, CNN/DailyMail, and Gigaword benchmarks show that GENIE achieves performance comparable to various strong baselines; after pre-training in particular, the generation quality of GENIE improves substantially. We also conduct extensive experiments on the generation diversity and parameter sensitivity of GENIE. The code for GENIE will be made publicly available.
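The abstract's inference procedure (encode the source once, then repeatedly denoise a Gaussian latent conditioned on the encoder states) can be sketched as follows. This is a minimal illustrative mock-up, not the authors' implementation: the encoder, the `denoise_step` update, and all names and dimensions here are hypothetical stand-ins for the real Transformer encoder and learned diffusion denoiser.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(source_tokens, dim=16):
    # Stand-in encoder: produces one latent vector per source token.
    # In GENIE this would be a Transformer encoder over the input text.
    return rng.standard_normal((len(source_tokens), dim))

def denoise_step(x_t, t, encoder_states):
    # Stand-in denoiser: removes a fraction of the estimated noise at
    # step t, conditioned on the encoder states (here crudely summarized
    # by mean-pooling; the real model uses cross-attention).
    context = encoder_states.mean(axis=0)
    predicted_noise = 0.1 * (x_t - context)
    return x_t - predicted_noise

def generate(source_tokens, seq_len=8, dim=16, num_steps=50):
    encoder_states = encode(source_tokens, dim)
    x = rng.standard_normal((seq_len, dim))   # start from pure Gaussian noise
    for t in reversed(range(num_steps)):      # iterate t = T-1, ..., 0
        x = denoise_step(x, t, encoder_states)
    return x  # final latent, which a real model maps back to output tokens

latent = generate(["the", "input", "text"])
print(latent.shape)  # (8, 16)
```

Because each step is conditioned on the same encoder states while the starting noise differs per sample, rerunning `generate` with fresh noise yields diverse outputs that are all controlled by the input text, which is the property the abstract emphasizes.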




