Detoxifying Language Models with a Toxic Corpus

04/30/2022
by Yoon A Park, et al.

Existing studies have investigated the tendency of autoregressive language models to generate content that exhibits undesired biases and toxicity. Various debiasing approaches have been proposed, which fall primarily into two categories: data-based and decoding-based. In our study, we investigate an ensemble of the two debiasing paradigms, proposing to use a toxic corpus as an additional resource to reduce toxicity. Our results show that a toxic corpus can indeed help to substantially reduce the toxicity of language generation, complementing existing debiasing methods.
