Holistic Sentence Embeddings for Better Out-of-Distribution Detection

10/14/2022
by Sishuo Chen, et al.

Detecting out-of-distribution (OOD) instances is crucial for the safe deployment of NLP models. Among recent textual OOD detection works based on pretrained language models (PLMs), distance-based methods have shown superior performance. However, they estimate sample distance scores in the last-layer CLS embedding space and thus do not make full use of the linguistic information encoded in PLMs. To address this issue, we propose to boost OOD detection by deriving more holistic sentence embeddings. Based on the observations that both token averaging and layer combination improve OOD detection, we propose a simple embedding approach named Avg-Avg, which averages all token representations from each intermediate layer as the sentence embedding and significantly surpasses the state-of-the-art on a comprehensive suite of benchmarks by a 9.33% FAR95 margin. Furthermore, our analysis demonstrates that it indeed helps preserve general linguistic knowledge in fine-tuned PLMs and substantially benefits detecting background shifts. This simple yet effective embedding method can be applied to fine-tuned PLMs at negligible extra cost, providing a free gain in OOD detection. Our code is available at https://github.com/lancopku/Avg-Avg.
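For concreteness, here is a minimal sketch of the two-step averaging the name Avg-Avg describes: mean-pool the token representations within every intermediate layer, then average those per-layer means into a single sentence embedding. It assumes a BERT-style encoder from Hugging Face transformers; the function name avg_avg_embedding and the checkpoint are illustrative, not the authors' released code (see the repository linked above for that).

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative sketch of Avg-Avg pooling; a fine-tuned checkpoint would
# normally be used in place of the base model assumed here.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

@torch.no_grad()
def avg_avg_embedding(texts):
    """Average token representations within each layer (first Avg), then
    average across all intermediate layers (second Avg)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # hidden_states: tuple of (num_layers + 1) tensors, each [batch, seq, dim]
    hidden_states = model(**batch).hidden_states
    mask = batch["attention_mask"].unsqueeze(-1).float()  # [batch, seq, 1]
    layer_means = []
    for layer in hidden_states[1:]:  # skip the embedding-layer output
        # masked mean over real (non-padding) tokens
        layer_means.append((layer * mask).sum(1) / mask.sum(1))
    # mean over layers -> [batch, dim]
    return torch.stack(layer_means, dim=0).mean(0)

print(avg_avg_embedding(["an in-distribution sentence"]).shape)  # [1, 768]
```

The distance score used on top of such embeddings in distance-based detectors is typically the Mahalanobis distance to class-conditional Gaussians fitted on in-distribution training embeddings; the sketch below shows that scoring step under the same caveat that it is an illustrative assumption, not the paper's exact implementation.

```python
import numpy as np

def mahalanobis_ood_score(train_embs, train_labels, test_embs):
    """Negative minimum squared Mahalanobis distance to the class means,
    with a shared covariance estimated on training embeddings
    (higher score = more in-distribution)."""
    classes = np.unique(train_labels)
    means = np.stack([train_embs[train_labels == c].mean(0) for c in classes])
    centered = np.concatenate([train_embs[train_labels == c] - means[i]
                               for i, c in enumerate(classes)])
    prec = np.linalg.pinv(np.cov(centered, rowvar=False))
    diffs = test_embs[:, None, :] - means[None, :, :]     # [n_test, n_class, dim]
    d2 = np.einsum("ncd,de,nce->nc", diffs, prec, diffs)  # squared distances
    return -d2.min(axis=1)
```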


