On the Effectiveness of Compact Biomedical Transformers

09/07/2022
by Omid Rohanian, et al.

Language models pre-trained on biomedical corpora, such as BioBERT, have recently shown promising results on downstream biomedical tasks. However, many existing pre-trained models are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension, and number of layers. The natural language processing (NLP) community has developed numerous strategies to compress these models utilising techniques such as pruning, quantisation, and knowledge distillation, resulting in models that are considerably faster, smaller, and consequently easier to use in practice. In the same vein, in this paper we introduce six lightweight models, namely BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT, and CompactBioBERT, which are obtained either by knowledge distillation from a biomedical teacher or by continual learning on the PubMed dataset via the Masked Language Modelling (MLM) objective. We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1, with the aim of creating efficient lightweight models that perform on par with their larger counterparts. All the models will be publicly available on our Hugging Face profile at https://huggingface.co/nlpie and the code used to run the experiments will be available at https://github.com/nlpie-research/Compact-Biomedical-Transformers.
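The abstract describes two routes to a compact biomedical model: distillation from a biomedical teacher, or continual learning via the MLM objective on PubMed. The snippet below is a minimal sketch of the second route using the Hugging Face transformers and datasets APIs. The starting checkpoint (distilbert-base-uncased), the local file pubmed_abstracts.txt, and all hyperparameters are illustrative assumptions, not the paper's exact setup; the released checkpoints are on the linked Hugging Face profile.

```python
# Sketch: continual MLM pre-training of a general-domain student on PubMed text.
# Assumptions: a general-domain DistilBERT checkpoint as the starting point and a
# local text file of PubMed abstracts standing in for the full PubMed corpus.
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "distilbert-base-uncased"  # general-domain student checkpoint (assumed)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Placeholder corpus: one PubMed abstract per line in a local text file.
raw = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM objective: randomly mask 15% of tokens and predict them.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="bio-distilbert-mlm",      # output path is an assumption
        per_device_train_batch_size=16,
        num_train_epochs=1,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
model.save_pretrained("bio-distilbert-mlm")
```

The distillation route would instead pair a small student with a biomedical teacher (e.g. BioBERT-v1.1) and add a distillation loss on the teacher's outputs; see the linked GitHub repository for the authors' actual training code.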


research · 02/09/2023 · Lightweight Transformers for Clinical Natural Language Processing
Specialised pre-trained language models are becoming more frequent in NL...

research · 10/12/2022 · MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers
Pre-trained Language Models (LMs) have become an integral part of Natura...

research · 02/03/2023 · Bioformer: an efficient transformer language model for biomedical text mining
Pretrained language models such as Bidirectional Encoder Representations...

research · 02/04/2022 · Transformers and the representation of biomedical background knowledge
BioBERT and BioMegatron are Transformers models adapted for the biomedic...

research · 10/11/2021 · Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Pre-trained language models (PLMs) have been the de facto paradigm for m...

research · 09/15/2021 · Can Language Models be Biomedical Knowledge Bases?
Pre-trained language models (LMs) have become ubiquitous in solving vari...

research · 10/14/2021 · Building Chinese Biomedical Language Models via Multi-Level Text Discrimination
Pre-trained language models (PLMs), such as BERT and GPT, have revolutio...
