Visual Question Answering in Remote Sensing with Cross-Attention and Multimodal Information Bottleneck

06/25/2023
by Jayesh Songara, et al.

In this research, we address the problem of visual question answering (VQA) in remote sensing. While remotely sensed images contain information that is significant for identification and object detection tasks, they are challenging to process because of their high dimensionality, volume, and redundancy. Furthermore, processing image information jointly with language features adds further constraints, such as mapping between corresponding image and language features. To handle this problem, we propose a cross-attention-based approach combined with information maximization. The CNN-LSTM-based cross-attention highlights the information in the image and language modalities and establishes a connection between the two, while information maximization learns a low-dimensional bottleneck layer that retains all the relevant information required to carry out the VQA task. We evaluate our method on two VQA remote sensing datasets of different resolutions. For the high-resolution dataset, we achieve an overall accuracy of 79.11% [...] sets, while for the low-resolution dataset, we achieve an overall accuracy of 85.98%.
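To make the data flow described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the CNN has already produced one feature vector per image region and the LSTM one vector per question token, uses plain scaled dot-product attention without learned projections, and stands in for the bottleneck with a fixed linear projection. All names (`cross_attend`, `mean_pool`, `bottleneck`, `projection`) and all numbers are hypothetical, and the information-maximization training objective is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, context):
    """Scaled dot-product attention where `context` serves as both keys and values.

    Each query vector is replaced by a softmax-weighted average of the
    context vectors. No learned projections are used (an assumption).
    """
    scale = math.sqrt(len(queries[0]))
    dim = len(context[0])
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in context])
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return attended


def mean_pool(vectors):
    """Average a list of vectors into a single vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def bottleneck(vector, weight_rows):
    """Project a vector to a lower dimension with a fixed linear map."""
    return [dot(row, vector) for row in weight_rows]


# Hypothetical stand-ins for CNN region features and LSTM token features.
image_regions = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
question_tokens = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Attend in both directions, pool, and concatenate into one fused vector.
question_over_image = cross_attend(question_tokens, image_regions)
image_over_question = cross_attend(image_regions, question_tokens)
fused = mean_pool(question_over_image) + mean_pool(image_over_question)

# Illustrative 8 -> 2 projection; in a trained model these weights are learned.
projection = [[0.5] * 8, [0.5, -0.5] * 4]
z = bottleneck(fused, projection)
print(z)
```

In a trained model, the low-dimensional vector `z` would feed an answer classifier, and the projection weights would be learned under an objective that encourages `z` to keep only answer-relevant information. The plain-list representation here is chosen only to keep the sketch dependency-free.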

research
09/24/2021

How to find a good image-text embedding for remote sensing visual question answering?

Visual question answering (VQA) has recently been introduced to remote s...
research
05/06/2022

From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data

Visual question answering (VQA) for remote sensing scene has great poten...
research
04/07/2023

Multilingual Augmentation for Robust Visual Question Answering in Remote Sensing Images

Aiming at answering questions based on the content of remotely sensed im...
research
03/16/2020

RSVQA: Visual Question Answering for Remote Sensing Data

This paper introduces the task of visual question answering for remote s...
research
06/01/2023

Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training

The Visual Question Answering (VQA) system offers a user-friendly interf...
research
03/13/2023

Polar-VQA: Visual Question Answering on Remote Sensed Ice sheet Imagery from Polar Region

For glaciologists, studying ice sheets from the polar regions is critica...
research
05/24/2023

Remote Sensing Image Change Detection Towards Continuous Bitemporal Resolution Differences

Most contemporary supervised Remote Sensing (RS) image Change Detection ...
