Remote Sensing Change Detection With Transformers Trained from Scratch

04/13/2023
by   Mubashir Noman, et al.

Current transformer-based change detection (CD) approaches either employ a backbone pre-trained on the large-scale ImageNet classification dataset or rely on first pre-training on another CD dataset and then fine-tuning on the target benchmark. This strategy is driven by the fact that transformers typically require a large amount of training data to learn inductive biases, and standard CD datasets are too small to provide it. We develop an end-to-end CD approach with transformers that is trained from scratch and yet achieves state-of-the-art performance on four public benchmarks. Instead of conventional self-attention, which struggles to capture inductive biases when trained from scratch, our architecture utilizes a shuffled sparse-attention operation that focuses on selected sparse informative regions to capture the inherent characteristics of the CD data. Moreover, we introduce a change-enhanced feature fusion (CEFF) module that fuses the features of the input image pair by per-channel re-weighting, enhancing the relevant semantic changes while suppressing the noisy ones. Extensive experiments on four CD datasets reveal the merits of the proposed contributions, with gains as high as 14.27% in intersection-over-union (IoU) score over the best published results in the literature. Code is available at <https://github.com/mustansarfiaz/ScratchFormer>.
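The per-channel re-weighting idea behind CEFF can be illustrated with a minimal NumPy sketch. The pooling and sigmoid gating below are assumptions for illustration only; the actual module in the paper operates on learned deep features with trainable parameters (see the linked repository for the real implementation).

```python
import numpy as np

def ceff_fuse(feat_a, feat_b):
    """Illustrative sketch of change-enhanced feature fusion (CEFF).

    feat_a, feat_b: feature maps of shape (C, H, W) from the two input
    images. Channels carrying a strong change signal are up-weighted;
    channels with little change (likely noise) are down-weighted.
    NOTE: the gating below is a hypothetical stand-in, not the paper's
    exact formulation.
    """
    diff = feat_a - feat_b                      # raw change signal
    # Per-channel change descriptor via global average pooling.
    w = np.abs(diff).mean(axis=(1, 2))          # shape (C,)
    # Squash to (0, 1) channel weights with a sigmoid.
    w = 1.0 / (1.0 + np.exp(-w))
    # Re-weight the fused features channel-wise.
    return w[:, None, None] * (feat_a + feat_b)
```

With identical inputs the change descriptor is zero, so every channel weight is sigmoid(0) = 0.5 and the fusion reduces to the input itself; channels where the two images differ strongly receive weights closer to 1.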


