A Closer Look at How Fine-tuning Changes BERT

06/27/2021
by Yichu Zhou, et al.

Given the prevalence of pre-trained contextualized representations in today's NLP, there have been several efforts to understand what information such representations contain. A common strategy to use such representations is to fine-tune them for an end task. However, how fine-tuning for a task changes the underlying space is less studied. In this work, we study the English BERT family and use two probing techniques to analyze how fine-tuning changes the space. Our experiments reveal that fine-tuning improves performance because it pushes points associated with a label away from other labels. By comparing the representations before and after fine-tuning, we also discover that fine-tuning does not change the representations arbitrarily; instead, it adjusts the representations to downstream tasks while preserving the original structure. Finally, using carefully constructed experiments, we show that fine-tuning can encode training sets in a representation, suggesting an overfitting problem of a new kind.
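As a rough illustration of the kind of geometric change the abstract describes, where points sharing a label move away from points of other labels, the sketch below compares how cleanly labels separate in [CLS]-embedding space before and after fine-tuning. The checkpoint names, the toy data, and the silhouette-based separation measure are illustrative assumptions, not the probing techniques used in the paper.

```python
# Minimal sketch (assumed setup, not the paper's probes): quantify how well
# points with the same label cluster together in [CLS] space, comparing a
# pre-trained BERT checkpoint with a fine-tuned one.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.metrics import silhouette_score

def cls_embeddings(model_name, texts, batch_size=16):
    """Return the last-layer [CLS] vector for each input text."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    vecs = []
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            enc = tok(texts[i:i + batch_size], padding=True,
                      truncation=True, return_tensors="pt")
            vecs.append(model(**enc).last_hidden_state[:, 0].cpu().numpy())
    return np.concatenate(vecs)

def label_separation(model_name, texts, labels):
    """Silhouette score of the labels in embedding space: higher means
    points of one label sit farther from points of the other labels."""
    return silhouette_score(cls_embeddings(model_name, texts), labels)

# Toy sentiment examples stand in for a real task's dev set.
texts = ["a gorgeous, witty film", "an utterly charming story",
         "a dull, lifeless mess", "painfully boring and flat"]
labels = [1, 1, 0, 0]
print("pre-trained:", label_separation("bert-base-uncased", texts, labels))
# "textattack/bert-base-uncased-SST-2" is an assumed public SST-2 checkpoint.
print("fine-tuned :", label_separation("textattack/bert-base-uncased-SST-2",
                                        texts, labels))
```

On real task data, a higher separation score for the fine-tuned checkpoint would be consistent with the abstract's claim that fine-tuning pushes points associated with a label away from other labels.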


