Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation

09/10/2020
by Junlong Li, et al.

Pre-trained Language Models (PrLMs) are widely used as backbones for many Natural Language Processing (NLP) tasks. The common recipe for using PrLMs is to first pre-train on large-scale general corpora with task-independent LM objectives and then fine-tune on task datasets with task-specific objectives. Task-independent pre-training lets the models learn language representations that are universal to some extent, but it fails to capture crucial task-specific features, leading to an incompatibility between pre-training and fine-tuning. To address this issue, we introduce task-specific pre-training on in-domain, task-related corpora with task-specific objectives. This stage is placed between the original two to strengthen the model's understanding of specific tasks. In this work, we focus on Dialogue-related Natural Language Processing (DrNLP) tasks and design a Dialogue-Adaptive Pre-training Objective (DAPO) based on important qualities for assessing dialogues that are usually ignored by general LM pre-training objectives. PrLMs pre-trained with DAPO on a large in-domain dialogue corpus are then fine-tuned for downstream DrNLP tasks. Experimental results show that models with DAPO surpass those with general LM pre-training objectives and other strong baselines on downstream DrNLP tasks.
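To make the three-stage procedure concrete, below is a minimal sketch of the pipeline (general pre-training, intermediate in-domain pre-training, downstream fine-tuning) using the Hugging Face transformers and datasets libraries. The backbone name, corpus file, and hyperparameters are illustrative assumptions, and the masked-LM loss in stage 2 is only a placeholder: the abstract does not specify DAPO's actual objective, so this sketch shows where such an objective would plug in, not how it is computed.

```python
# Sketch of: general PrLM -> in-domain (dialogue-adaptive) pre-training -> task fine-tuning.
# Assumed names: "bert-base-uncased" backbone, "in_domain_dialogues.txt" corpus,
# "dialogue_adapted" output dir. The MLM objective stands in for the paper's DAPO loss.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BACKBONE = "bert-base-uncased"  # Stage 1: a generally pre-trained PrLM
tokenizer = AutoTokenizer.from_pretrained(BACKBONE)

# ---- Stage 2: task-specific pre-training on an in-domain dialogue corpus ----
corpus = load_dataset("text", data_files={"train": "in_domain_dialogues.txt"})
tokenized = corpus["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)
mlm_model = AutoModelForMaskedLM.from_pretrained(BACKBONE)
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="dialogue_adapted", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
).train()
mlm_model.save_pretrained("dialogue_adapted")
tokenizer.save_pretrained("dialogue_adapted")

# ---- Stage 3: fine-tune the adapted checkpoint on a downstream DrNLP task ----
# e.g., a binary response-selection head; build the task dataset and run Trainer again.
classifier = AutoModelForSequenceClassification.from_pretrained(
    "dialogue_adapted", num_labels=2
)
```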


