Conditional BERT Contextual Augmentation

12/17/2018
by Xing Wu, et al.

We propose a novel data augmentation method for labeled sentences called conditional BERT contextual augmentation. Data augmentation methods are often applied to prevent overfitting and improve the generalization of deep neural network models. The recently proposed contextual augmentation augments labeled sentences by randomly replacing words with more varied substitutions predicted by a language model. BERT demonstrates that a deep bidirectional language model is more powerful than either a unidirectional language model or the shallow concatenation of a forward and a backward model. We retrofit BERT to conditional BERT by introducing a new conditional masked language model task. (The term "conditional masked language model" appears once in the original BERT paper, where it means conditional on context and is equivalent to "masked language model"; in our paper, it means that we impose an extra label-conditional constraint on the masked language model.) The well-trained conditional BERT can then be applied to enhance contextual augmentation. Experiments on six text classification tasks show that our method can be easily applied to both convolutional and recurrent neural network classifiers to obtain clear improvements.
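For illustration, the sketch below (not the authors' released code) shows how a trained conditional BERT could be used to augment a labeled sentence with Hugging Face's transformers library. The checkpoint path MODEL_DIR, the augment helper, and the binary label ids are assumptions; the label is fed in through token_type_ids, following the paper's idea of repurposing BERT's segment embeddings as label embeddings so that predictions for a masked word are compatible with the sentence's class.

```python
# Minimal sketch of label-conditional word replacement, assuming a
# conditional BERT checkpoint fine-tuned with the label-conditional MLM task.
import torch
from transformers import BertForMaskedLM, BertTokenizer

MODEL_DIR = "path/to/conditional-bert"  # hypothetical fine-tuned checkpoint

tokenizer = BertTokenizer.from_pretrained(MODEL_DIR)
model = BertForMaskedLM.from_pretrained(MODEL_DIR)
model.eval()

def augment(sentence: str, label_id: int, mask_pos: int, k: int = 5):
    """Return k augmented sentences, replacing the token at mask_pos
    with substitutes predicted under the class label label_id."""
    enc = tokenizer(sentence, return_tensors="pt")
    input_ids = enc["input_ids"].clone()
    input_ids[0, mask_pos] = tokenizer.mask_token_id
    # The segment (token-type) embeddings act as label embeddings, so the
    # class label is passed through token_type_ids. Vanilla BERT has only
    # two segment ids; more labels would require resizing that table.
    token_type_ids = torch.full_like(input_ids, label_id)
    with torch.no_grad():
        logits = model(input_ids=input_ids,
                       attention_mask=enc["attention_mask"],
                       token_type_ids=token_type_ids).logits
    top_ids = logits[0, mask_pos].topk(k).indices.tolist()
    sentences = []
    for tid in top_ids:
        ids = input_ids[0].tolist()
        ids[mask_pos] = tid
        sentences.append(tokenizer.decode(ids[1:-1]))  # drop [CLS]/[SEP]
    return sentences

# Example: augment("the movie was terrible", label_id=0, mask_pos=4)
# masks "terrible" (position 4 after [CLS]) and proposes substitutes
# consistent with the negative label, rather than any fluent word.
```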


Related Research

05/16/2018
Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations
We propose a novel data augmentation for labeled sentences called contex...

11/08/2019
Not Enough Data? Deep Learning to the Rescue!
Based on recent advances in natural language modeling and those in text ...

02/11/2019
BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
We show that BERT (Devlin et al., 2018) is a Markov random field languag...

04/17/2020
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
Even though BERT achieves successful performance improvements in various...

10/29/2020
Contextual BERT: Conditioning the Language Model Using a Global State
BERT is a popular language model whose main pre-training task is to fill...

09/25/2020
A little goes a long way: Improving toxic language classification despite data scarcity
Detection of some types of toxic language is hampered by extreme scarcit...

08/27/2021
Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors
In this paper, we explore the capacity of a language model-based method ...
