Transformer-Guided Convolutional Neural Network for Cross-View Geolocalization

04/21/2022
by Teng Wang, et al.

Ground-to-aerial geolocalization refers to localizing a ground-level query image by matching it against a reference database of geo-tagged aerial imagery. The task is very challenging because visual appearance and geometric configuration differ drastically between the two views. In this work, we propose a novel Transformer-guided convolutional neural network (TransGCNN) architecture, which couples CNN-based local features with Transformer-based global representations for enhanced representation learning. Specifically, our TransGCNN consists of a CNN backbone that extracts a feature map from an input image and a Transformer head that models global context from the CNN map. In particular, our Transformer head acts as a spatially aware importance generator that selects salient CNN features as the final feature representation. This coupling allows us to leverage a lightweight Transformer network to greatly enhance the discriminative capability of the embedded features. Furthermore, we design a dual-branch Transformer head network that combines image features from multi-scale windows to improve the details of the global feature representation. Extensive experiments on popular benchmark datasets demonstrate that our model achieves top-1 accuracy of 94.12% and 84.92% on CVUSA and CVACT_val, respectively, outperforming the second-performing baseline with less than 50% of the parameters and almost 2x higher frame rate, thereby achieving a preferable accuracy-efficiency tradeoff.
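
As a rough illustration of an importance generator that selects salient CNN features, the following minimal Python sketch pools a toy feature map using softmax-normalized importance scores and then performs top-1 retrieval by cosine similarity. It is not the authors' implementation: the function names (`weighted_pool`, `top1_match`), the softmax normalization, and the hand-written scores standing in for the Transformer head output are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def weighted_pool(feature_map, importance_logits):
    """Pool N spatial positions (each a list of C channels) into one descriptor.

    importance_logits holds one score per position. In this sketch the scores
    are supplied by hand; they are a hypothetical stand-in for whatever a
    Transformer head would produce from the CNN feature map.
    """
    weights = softmax(importance_logits)
    channels = len(feature_map[0])
    pooled = [
        sum(w * position[c] for w, position in zip(weights, feature_map))
        for c in range(channels)
    ]
    return l2_normalize(pooled)


def top1_match(query, references):
    """Return the index of the reference with the highest cosine similarity."""
    q = l2_normalize(query)
    sims = [
        sum(a * b for a, b in zip(q, l2_normalize(ref)))
        for ref in references
    ]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy example: two spatial positions, two channels.
ground_map = [[1.0, 0.0], [0.0, 1.0]]
ground_descriptor = weighted_pool(ground_map, [10.0, 0.0])  # favors position 0

# Hypothetical aerial descriptors, assumed to be already embedded.
aerial_descriptors = [[0.0, 1.0], [1.0, 0.0]]
print(top1_match(ground_descriptor, aerial_descriptors))  # prints 1
```

Top-1 accuracy, the metric quoted in the abstract, is the fraction of queries for which such a best match is the correct reference image.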
