Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations

10/12/2020
by Nikolaos Manginas, et al.

Although BERT is widely used by the NLP community, little is known about its inner workings. Several attempts have been made to shed light on certain aspects of BERT, often with contradictory conclusions. A frequently raised concern is BERT's over-parameterization and the under-utilization of its capacity. To this end, we propose a novel approach to fine-tune BERT in a structured manner. Specifically, we focus on Large Scale Multilabel Text Classification (LMTC), where documents are assigned one or more labels from a large predefined set of hierarchically organized labels. Our approach guides specific BERT layers to predict labels from specific levels of the hierarchy. Experimenting with two LMTC datasets, we show that this structured fine-tuning approach not only yields better classification results but also leads to better parameter utilization.
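The core idea, supervising selected BERT layers with labels from corresponding hierarchy levels, can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the paper's exact method: the class name `LayerwiseGuidedBert`, the choice of guided layers, the use of the [CLS] representation for pooling, the binary cross-entropy loss, and the summed per-level losses are all assumptions for the sketch; only the layer-to-hierarchy-level supervision comes from the abstract.

```python
import torch.nn as nn
from transformers import BertModel

class LayerwiseGuidedBert(nn.Module):
    """Sketch: each selected BERT layer gets its own classification head,
    supervised with the labels of one level of the label hierarchy."""

    def __init__(self, num_labels_per_level, guided_layers,
                 model_name="bert-base-uncased"):
        super().__init__()
        # Expose every intermediate layer's hidden states.
        self.bert = BertModel.from_pretrained(model_name,
                                              output_hidden_states=True)
        hidden = self.bert.config.hidden_size
        # guided_layers[i] is the encoder layer guided to predict hierarchy
        # level i. In outputs.hidden_states, index k is the output of encoder
        # layer k (index 0 is the embedding layer).
        self.guided_layers = guided_layers
        self.heads = nn.ModuleList(nn.Linear(hidden, n)
                                   for n in num_labels_per_level)
        self.loss_fn = nn.BCEWithLogitsLoss()  # multi-label targets

    def forward(self, input_ids, attention_mask, labels_per_level=None):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        logits, loss = [], 0.0
        for i, (head, layer) in enumerate(zip(self.heads, self.guided_layers)):
            cls = out.hidden_states[layer][:, 0]  # [CLS] token at that layer
            level_logits = head(cls)
            logits.append(level_logits)
            if labels_per_level is not None:
                # Sum per-level losses so shallower layers learn coarse
                # labels and deeper layers learn finer ones.
                loss = loss + self.loss_fn(level_logits,
                                           labels_per_level[i].float())
        return logits, (loss if labels_per_level is not None else None)

# Illustrative usage (label counts and layer choices are made up):
# guide layers 4, 8, and 12 toward 3 increasingly fine hierarchy levels.
model = LayerwiseGuidedBert(num_labels_per_level=[10, 100, 1000],
                            guided_layers=[4, 8, 12])
```

Under these assumptions, intermediate layers receive a direct training signal instead of being supervised only through the final layer, which is one plausible way to read the paper's claim of improved parameter utilization.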
