Resource-efficient domain adaptive pre-training for medical images

by Yasar Mehmood, et al.

Deep learning-based analysis of medical images suffers from data scarcity because of high annotation costs and privacy concerns. Researchers in this domain have used transfer learning to avoid overfitting when using complex architectures. However, the domain differences between pre-training and downstream data hamper the performance of the downstream task. Some recent studies have successfully used domain-adaptive pre-training (DAPT) to address this issue. In DAPT, models are initialized with weights pre-trained on a generic dataset, and further pre-training is performed using a moderately sized in-domain dataset (medical images). Although this technique has achieved good downstream accuracy and robustness, it is computationally expensive even when the datasets for DAPT are moderately sized. Such compute-intensive techniques and models have a negative environmental impact and create an uneven playing field for researchers with limited resources. This study proposes three techniques for computationally efficient DAPT that do not compromise downstream accuracy or robustness. The first (partial DAPT) performs DAPT on a subset of layers. The second (hybrid DAPT) performs partial DAPT for a few epochs and then full DAPT for the remaining epochs. The third performs DAPT on simplified variants of the base architecture. The results showed that, compared to standard DAPT (full DAPT), hybrid DAPT achieved better performance on the development and external datasets. In contrast, the simplified architectures (after DAPT) achieved the best robustness while achieving modest performance on the development dataset.
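
The relationship between full, partial, and hybrid DAPT can be illustrated with a small scheduling sketch. This is not the authors' implementation: the abstract does not say which layers are selected or how many epochs are spent in the partial phase, so the layer names, the chosen subset, and the epoch counts below are hypothetical. The sketch only shows which layers would be updated in each epoch; in a real training loop the layers not listed would be frozen.

```python
from typing import List, Set


def trainable_layers(
    epoch: int,
    layer_names: List[str],
    partial_layers: Set[str],
    partial_epochs: int,
) -> Set[str]:
    """Return the layers updated in a given 0-indexed epoch.

    For the first `partial_epochs` epochs only `partial_layers` are
    updated (partial DAPT); afterwards every layer is updated (full DAPT).
    """
    if epoch < partial_epochs:
        return {name for name in layer_names if name in partial_layers}
    return set(layer_names)


def hybrid_schedule(
    total_epochs: int,
    layer_names: List[str],
    partial_layers: Set[str],
    partial_epochs: int,
) -> List[Set[str]]:
    """Build the per-epoch list of trainable layers for the whole run."""
    return [
        trainable_layers(epoch, layer_names, partial_layers, partial_epochs)
        for epoch in range(total_epochs)
    ]


# Hypothetical four-block model; the subset {"block3", "block4"} and the
# 2-of-5 epoch split are assumptions chosen only for illustration.
layers = ["block1", "block2", "block3", "block4"]
schedule = hybrid_schedule(
    total_epochs=5,
    layer_names=layers,
    partial_layers={"block3", "block4"},
    partial_epochs=2,
)

for epoch, names in enumerate(schedule):
    print(epoch, sorted(names))
```

In this sketch, setting `partial_epochs` equal to `total_epochs` corresponds to partial DAPT throughout, and setting it to zero corresponds to full DAPT.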




