Improved Pretraining for Domain-specific Contextual Embedding Models

by Subendhu Rongali, et al.
University of Massachusetts Amherst

We investigate methods to mitigate catastrophic forgetting during domain-specific pretraining of contextual embedding models such as BERT, DistilBERT, and RoBERTa. Recently proposed domain-specific models such as BioBERT, SciBERT, and ClinicalBERT are constructed by continuing the pretraining phase on a domain-specific text corpus. Such pretraining is susceptible to catastrophic forgetting, in which the model loses some of the information it learned in the general domain. We propose the use of two continual learning techniques (rehearsal and elastic weight consolidation) to improve domain-specific training. Our results show that models trained with the proposed approaches better maintain their performance on general-domain tasks while also outperforming domain-specific baseline models on downstream domain tasks.
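The abstract names the two techniques without giving their details, so the following is only a minimal sketch of how they are commonly formulated, not the authors' implementation. It assumes the standard quadratic form of the elastic weight consolidation penalty, (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2, and treats rehearsal as mixing a fixed fraction of general-domain examples into each domain batch. All function names, the flat parameter lists, and the `general_fraction` and `lam` parameters are hypothetical choices made for illustration.

```python
import random


def ewc_penalty(params, anchor_params, fisher, lam):
    """Standard quadratic EWC penalty (assumed form, not taken from the paper).

    params:        current parameter values (flat list of floats)
    anchor_params: parameter values after general-domain pretraining
    fisher:        per-parameter importance estimates (diagonal Fisher)
    lam:           strength of the penalty
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def ewc_loss(domain_loss, params, anchor_params, fisher, lam):
    """Domain pretraining loss plus the EWC penalty."""
    return domain_loss + ewc_penalty(params, anchor_params, fisher, lam)


def rehearsal_batch(domain_examples, general_examples, batch_size,
                    general_fraction, rng):
    """Build one batch that mixes general-domain examples into domain data.

    general_fraction is a hypothetical knob: the share of each batch that is
    drawn from the general-domain corpus.
    """
    n_general = round(batch_size * general_fraction)
    n_domain = batch_size - n_general
    batch = rng.sample(general_examples, n_general)
    batch += rng.sample(domain_examples, n_domain)
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    # Only the first parameter moved, so only it contributes to the penalty.
    print(ewc_loss(1.5, [1.0, 2.0], [0.0, 2.0], [2.0, 5.0], lam=4.0))

    rng = random.Random(0)
    domain = [f"d{i}" for i in range(20)]
    general = [f"g{i}" for i in range(20)]
    print(rehearsal_batch(domain, general, 8, 0.25, rng))
```

In this sketch the penalty discourages movement in parameters that the importance estimates mark as relevant to the general domain, while rehearsal keeps general-domain text in the training stream. Either can be applied on its own during continued pretraining.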


