Democratized Diffusion Language Model

05/18/2023
by Nikita Balagansky, et al.

Despite the potential benefits of Diffusion Models for NLP applications, no publicly available implementations, trained models, or reproducible training procedures currently exist. We present the Democratized Diffusion Language Model (DDLM), based on the Continuous Diffusion for Categorical Data (CDCD) framework, to address these challenges. We propose a simplified training procedure for DDLM using the C4 dataset and perform an in-depth analysis of the trained model's behavior. Furthermore, we introduce a novel early-exiting strategy for faster sampling with models trained with score interpolation. Since no previous work has addressed solving downstream tasks (e.g., classification) with a pre-trained Diffusion LM, we experimented with the GLUE Benchmark to study DDLM's ability to transfer knowledge. With this paper, we release training and evaluation pipelines along with pre-trained DDLM models, which other researchers can use in future work with Diffusion LMs.
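The early-exiting idea mentioned above can be sketched as follows. This is a hedged illustration, not the paper's actual implementation: the function names (`denoise_step`, `decode`), the stability criterion, and the `patience` parameter are all assumptions made for the sake of the example. The core intuition is that if the decoded token sequence stops changing across consecutive reverse-diffusion steps, further denoising is unlikely to alter the output, so sampling can stop early.

```python
# Hypothetical early-exit loop for diffusion LM sampling (illustrative
# sketch only; names and criterion are assumptions, not the paper's API).

def sample_with_early_exit(denoise_step, decode, x, num_steps, patience=3):
    """Run up to `num_steps` reverse-diffusion steps, exiting early once
    the decoded token sequence is unchanged for `patience` steps in a row.

    Returns (tokens, steps_used)."""
    prev_tokens = None
    stable = 0
    for t in range(num_steps, 0, -1):
        x = denoise_step(x, t)      # one reverse-diffusion update
        tokens = decode(x)          # e.g. nearest-embedding decoding
        if tokens == prev_tokens:
            stable += 1
            if stable >= patience:  # prediction has converged: exit early
                return tokens, num_steps - t + 1
        else:
            stable = 0
        prev_tokens = tokens
    return prev_tokens, num_steps


# Toy usage with a stand-in "denoiser" that pulls each coordinate halfway
# toward a fixed target embedding, and rounding as the decoder.
target = [2.0, 5.0, 1.0]
step = lambda x, t: [xi + 0.5 * (ti - xi) for xi, ti in zip(x, target)]
decode = lambda x: [round(xi) for xi in x]
tokens, steps_used = sample_with_early_exit(step, decode, [0.0, 0.0, 0.0], 50)
```

In this toy run the decoded sequence stabilizes long before the full 50-step budget, so `steps_used` is far smaller than `num_steps`, which is exactly the speedup an early-exit strategy targets.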


