Can Pretrained Language Models Derive Correct Semantics from Corrupt Subwords under Noise?

06/27/2023
by Xinzhe Li, et al.
The susceptibility of Pretrained Language Models (PLMs) to noise has recently been linked to subword segmentation. However, it is unclear which aspects of segmentation affect their understanding. This study assesses the robustness of PLMs to the different ways in which noise disrupts segmentation. It proposes an evaluation framework for subword segmentation, the Contrastive Lexical Semantic (CoLeS) probe, which provides a systematic categorization of segmentation corruption under noise, together with evaluation protocols based on contrastive datasets of canonical-noisy word pairs. Experimental results indicate that PLMs cannot accurately compute word meanings when noise introduces completely different subwords, small subword fragments, or a large number of additional subwords, particularly when these are inserted within other subwords.
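
To make the idea of a canonical-noisy word pair concrete, the following is a minimal sketch of how a single typo can change a word's subword segmentation. It assumes a toy greedy longest-match segmenter in the style of WordPiece and a tiny hand-written vocabulary; `TOY_VOCAB`, `segment`, and `compare_pair` are hypothetical names introduced here for illustration, and none of this reproduces the CoLeS probe or its corruption categories.

```python
# Toy vocabulary (an assumption for illustration, not a real PLM vocabulary).
# "##" marks a subword that continues a word, as in WordPiece-style vocabularies.
TOY_VOCAB = {"token", "##ization", "##tion", "##i", "##a", "##z"}


def segment(word, vocab):
    """Greedy longest-match-first segmentation of a single word."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        else:
            # No substring starting at `start` is in the vocabulary.
            return ["[UNK]"]
        start = end
    return pieces


def compare_pair(canonical, noisy, vocab):
    """Segment a canonical-noisy word pair and summarize how the pieces differ."""
    canonical_pieces = segment(canonical, vocab)
    noisy_pieces = segment(noisy, vocab)
    return {
        "canonical": canonical_pieces,
        "noisy": noisy_pieces,
        # Subwords kept intact despite the noise.
        "shared": sorted(set(canonical_pieces) & set(noisy_pieces)),
        # How many more subwords the noisy form needs.
        "extra_subwords": len(noisy_pieces) - len(canonical_pieces),
    }


if __name__ == "__main__":
    # "tokeniaztion" swaps two adjacent characters of "tokenization".
    print(compare_pair("tokenization", "tokeniaztion", TOY_VOCAB))
```

In this toy setting the swap keeps `token` intact but splits the rest of the word into several small fragments. This is only one illustrative way noise can alter segmentation; the paper's own categorization should be taken from the full text.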

research
09/06/2020

Duluth at SemEval-2020 Task 7: Using Surprise as a Key to Unlock Humorous Headlines

We use pretrained transformer-based language models in SemEval-2020 Task...
research
10/12/2020

Probing Pretrained Language Models for Lexical Semantics

The success of large pretrained language models (LMs) such as BERT and R...
research
03/21/2022

Word Order Does Matter (And Shuffled Language Models Know It)

Recent studies have shown that language models pretrained and/or fine-tu...
research
05/14/2023

ParaLS: Lexical Substitution via Pretrained Paraphraser

Lexical substitution (LS) aims at finding appropriate substitutes for a ...
research
05/26/2023

Three Towers: Flexible Contrastive Learning with Pretrained Image Models

We introduce Three Towers (3T), a flexible method to improve the contras...
research
10/29/2021

Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words

We investigate the performance on phoneme categorization and phoneme and...
research
08/17/2020

Lazy caterer jigsaw puzzles: Models, properties, and a mechanical system-based solver

Jigsaw puzzle solving, the problem of constructing a coherent whole from...
