Aerial Image Object Detection With Vision Transformer Detector (ViTDet)

01/28/2023
by Liya Wang, et al.

The past few years have seen increased interest in aerial image object detection because of its critical value to large-scale geo-scientific research such as environmental studies, urban planning, and intelligence monitoring. However, the task is very challenging due to the bird's-eye view perspective, complex backgrounds, large and varied image sizes, diverse object appearances, and the scarcity of well-annotated datasets. Recent advances in computer vision have shown promise in tackling this challenge. Specifically, Vision Transformer Detector (ViTDet) was proposed to extract multi-scale features for object detection. Its empirical study shows that ViTDet's simple design achieves good performance on natural scene images and can be easily embedded into any detector architecture. To date, ViTDet's potential benefit to challenging aerial image object detection has not been explored. Therefore, in our study, 25 experiments were carried out to evaluate the effectiveness of ViTDet for aerial image object detection on three well-known datasets: Airbus Aircraft, RarePlanes, and Dataset of Object DeTection in Aerial images (DOTA). Our results show that ViTDet can consistently outperform its convolutional neural network counterparts on horizontal bounding box (HBB) object detection by a large margin (up to 17% in average precision) and that it achieves competitive performance for oriented bounding box (OBB) object detection. Our results also establish a baseline for future research.
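To make the HBB/OBB distinction in the abstract concrete, the following is a minimal sketch of how an oriented box relates to its enclosing horizontal box. The (cx, cy, w, h, angle) parameterization and the function names `obb_corners` and `obb_to_hbb` are illustrative assumptions; they are not taken from the paper and are not the annotation format of the datasets named above.

```python
import math


def obb_corners(cx, cy, w, h, angle_rad):
    """Return the four corners of a w-by-h box centered at (cx, cy),
    rotated counter-clockwise by angle_rad (hypothetical parameterization)."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    half_offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    # Rotate each corner offset, then translate it to the box center.
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in half_offsets
    ]


def obb_to_hbb(cx, cy, w, h, angle_rad):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) that
    encloses the oriented box."""
    xs, ys = zip(*obb_corners(cx, cy, w, h, angle_rad))
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    # A 4x2 box rotated by 90 degrees is enclosed by a roughly 2x4 horizontal box.
    print(obb_to_hbb(0.0, 0.0, 4.0, 2.0, math.pi / 2))
```

For elongated objects at an angle, such as an aircraft photographed diagonally, the enclosing horizontal box covers noticeably more background than the oriented box, which is one reason the two detection settings are evaluated separately.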


