LERT: A Linguistically-motivated Pre-trained Language Model

11/10/2022
by Yiming Cui, et al.

Pre-trained Language Models (PLMs) have become representative foundation models in natural language processing. Most PLMs are trained with linguistics-agnostic pre-training tasks on the surface form of the text, such as masked language modeling (MLM). To further empower PLMs with richer linguistic features, in this paper we propose a simple but effective way to learn linguistic features for pre-trained language models. We propose LERT, a pre-trained language model that is trained on three types of linguistic features alongside the original MLM pre-training task, using a linguistically-informed pre-training (LIP) strategy. We carry out extensive experiments on ten Chinese NLU tasks, and the results show that LERT brings significant improvements over various comparable baselines. Furthermore, we conduct analytical experiments on various linguistic aspects, and the results confirm that the design of LERT is valid and effective. Resources are available at https://github.com/ymcui/LERT.
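The abstract describes a multi-task pre-training setup: a shared encoder optimizes the standard MLM objective together with three token-level linguistic prediction tasks, combined under the LIP strategy. The sketch below illustrates that structure in PyTorch; it is not the authors' implementation. The head names (part-of-speech, named-entity, dependency-relation), the tag-set sizes, and the fixed loss weights are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (assumed structure, not the LERT codebase): a shared encoder
# with an MLM head plus three auxiliary token-level heads for linguistic tags.
import torch
import torch.nn as nn

class LERTStylePretrainingModel(nn.Module):
    def __init__(self, vocab_size, hidden=768, n_layers=12, n_heads=12,
                 n_pos_tags=32, n_ner_tags=16, n_dep_rels=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(hidden, n_heads, 4 * hidden,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # MLM head: predict the original token at masked positions.
        self.mlm_head = nn.Linear(hidden, vocab_size)
        # Auxiliary heads: predict linguistic tags (placeholder tag-set sizes).
        self.pos_head = nn.Linear(hidden, n_pos_tags)
        self.ner_head = nn.Linear(hidden, n_ner_tags)
        self.dep_head = nn.Linear(hidden, n_dep_rels)

    def forward(self, input_ids):
        h = self.encoder(self.embed(input_ids))
        return (self.mlm_head(h), self.pos_head(h),
                self.ner_head(h), self.dep_head(h))

def pretraining_loss(logits, labels, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the MLM loss and the three linguistic-task losses.

    Each label tensor uses -100 at positions that should not contribute,
    following the usual masked-LM convention. The fixed `weights` tuple is
    only a stand-in for the LIP schedule described in the paper.
    """
    ce = nn.CrossEntropyLoss(ignore_index=-100)
    total = 0.0
    for w, lg, lb in zip(weights, logits, labels):
        total = total + w * ce(lg.reshape(-1, lg.size(-1)), lb.reshape(-1))
    return total
```

Under this reading, the LIP strategy governs how strongly each linguistic task contributes at different stages of pre-training; the constant weights above merely mark where such a schedule would plug in.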


Related research

03/14/2022 · PERT: Pre-training BERT with Permuted Language Model
06/13/2022 · JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding
08/05/2022 · Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss
06/05/2023 · Benchmarking Middle-Trained Language Models for Neural Search
07/04/2022 · Probing via Prompting
09/19/2023 · OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
04/20/2018 · Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
