Prompt Tuning for Parameter-efficient Medical Image Segmentation

by Marc Fischer, et al.

Neural networks pre-trained with a self-supervision scheme have become the standard when operating in data-rich environments with scarce annotations. As such, fine-tuning a model to a downstream task in a parameter-efficient but effective way, e.g. for a new set of classes in the case of semantic segmentation, is of increasing importance. In this work, we propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets. Relying on the recently popularized prompt tuning approach, we provide a prompt-able UNet (PUNet) architecture that is frozen after pre-training but adaptable throughout the network by class-dependent learnable prompt tokens. We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes (contrastive prototype assignment, CPA) of a student-teacher combination, alongside a concurrent segmentation loss on a subset of classes. We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models on CT imaging datasets. As such, the difference between fully fine-tuned and prompt-tuned variants amounts to only 3.83 pp for the TCIA/BTCV dataset and 2.67 pp for the CT-ORG dataset in the mean Dice Similarity Coefficient (DSC, in %), while only the prompt tokens, corresponding to 0.85% of the model parameters, are adjusted. The code for this work is available online.
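The reported gaps are differences in mean DSC, expressed in percentage points (pp). The following is a minimal sketch of how such a gap could be computed from label maps, not the authors' evaluation code. The function names (`dice_coefficient`, `mean_dice_percent`, `gap_in_pp`), the flattened-list input format, and the convention of scoring a class that is absent from both maps as 1.0 are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice Similarity Coefficient of one class over two flattened label maps.

    DSC = 2 * |P ∩ T| / (|P| + |T|), where P and T are the sets of voxels
    assigned to `label` in the prediction and in the reference.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of voxels")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    if pred_count + target_count == 0:
        # Assumption: a class missing from both maps counts as a perfect match.
        return 1.0
    return 2.0 * overlap / (pred_count + target_count)


def mean_dice_percent(pred: Sequence[int], target: Sequence[int],
                      labels: Sequence[int]) -> float:
    """Mean DSC over the given foreground labels, in percent."""
    scores = [dice_coefficient(pred, target, label) for label in labels]
    return 100.0 * sum(scores) / len(scores)


def gap_in_pp(full_finetuned_percent: float, prompt_tuned_percent: float) -> float:
    """Gap between two mean DSC values (both in %), in percentage points."""
    return full_finetuned_percent - prompt_tuned_percent


if __name__ == "__main__":
    # Toy data only; these are not results from the paper.
    reference = [0, 1, 1, 2, 2, 2]
    full_pred = [0, 1, 1, 2, 2, 2]
    prompt_pred = [0, 1, 2, 2, 2, 2]
    full = mean_dice_percent(full_pred, reference, labels=(1, 2))
    prompt = mean_dice_percent(prompt_pred, reference, labels=(1, 2))
    print(f"gap: {gap_in_pp(full, prompt):.2f} pp")
```

A gap in pp is an absolute difference between two percentages, so a 3.83 pp gap means the prompt-tuned mean DSC is 3.83 percentage points lower than the fully fine-tuned one, not 3.83% lower in relative terms.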




