Improving BERT with Self-Supervised Attention

04/08/2020
by Xiaoyu Kou, et al.

One of the most popular paradigms for applying large, pre-trained NLP models such as BERT is to fine-tune them on a smaller, task-specific dataset. However, one challenge remains: the fine-tuned model often overfits on small datasets. A symptom of this phenomenon is that irrelevant words in a sentence, even ones obviously irrelevant to humans, can substantially degrade the performance of the fine-tuned BERT model. In this paper, we propose a novel technique, called Self-Supervised Attention (SSA), to address this generalization challenge. Specifically, SSA iteratively generates weak, token-level attention labels by "probing" the fine-tuned model from the previous iteration. We investigate two ways of integrating SSA into BERT and propose a hybrid approach that combines their benefits. Empirically, we demonstrate significant performance improvements from our SSA-enhanced BERT model on a variety of public datasets.
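The core idea of "probing" can be illustrated with a minimal sketch: delete one token at a time and check how much the model's prediction confidence shifts; tokens whose removal barely moves the prediction receive a weak "irrelevant" label. This is a simplified, hypothetical rendering of the procedure (the `predict` interface, the toy model, and the threshold are assumptions for illustration, not the paper's exact method):

```python
def probe_token_labels(predict, tokens, threshold=0.1):
    """Generate weak token-level attention labels by probing a
    fine-tuned classifier. A token is labeled relevant (1) if
    deleting it changes the model's confidence by more than
    `threshold`, and irrelevant (0) otherwise.

    `predict` maps a token list to a confidence score in [0, 1]
    (hypothetical interface; the paper's probing may differ).
    """
    base = predict(tokens)
    labels = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]  # drop the i-th token
        delta = abs(base - predict(reduced))
        labels.append(1 if delta > threshold else 0)
    return labels


# Toy stand-in for a fine-tuned model: confidence rises with
# each positive word in the sentence.
POSITIVE = {"great", "excellent", "good"}

def toy_predict(tokens):
    hits = sum(t in POSITIVE for t in tokens)
    return min(1.0, 0.5 + 0.2 * hits)


print(probe_token_labels(toy_predict, ["the", "movie", "was", "great"]))
# -> [0, 0, 0, 1]: only removing "great" shifts the prediction
```

In the paper's iterative setting, these weak labels would then supervise the attention of the next fine-tuning round, so that the model learns to down-weight tokens that probing marks as irrelevant.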


