Temporal Aggregation for Adaptive RGBT Tracking

by Zhangyong Tang, et al.

Visual object tracking with both RGB and thermal infrared (TIR) spectra available, abbreviated as RGBT tracking, is a novel and challenging research topic that is drawing increasing attention. In this paper, we propose an RGBT tracker that takes spatio-temporal clues into account for robust appearance model learning and, at the same time, constructs an adaptive fusion sub-network for cross-modal interactions. Unlike most existing RGBT trackers, which perform object tracking using only spatial information, this method also considers temporal information. Specifically, traditional Siamese trackers obtain only one search image when sampling template-search image pairs; here, an extra search sample adjacent to the original one is selected to predict the temporal transformation, which improves the robustness of tracking. As for multi-modal tracking, because the available RGBT datasets are limited, the adaptive fusion sub-network is appended to our method at the decision level to reflect the complementary characteristics of the two modalities. To design a thermal infrared assisted RGB tracker, the outputs of the classification head from the TIR modality are taken into consideration before the residual connection from the RGB modality. Extensive experimental results on three challenging datasets, i.e. VOT-RGBT2019, GTOT and RGBT210, verify the effectiveness of our method. Code will be shared at https://github.com/Zhangyong-Tang/TAAT.
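The sampling step described in the abstract, where a template is paired with a search frame plus one extra search frame adjacent to it, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_triplet`, the `max_gap` parameter, and the rule of taking the next frame (or the previous one at the end of the sequence) as the "adjacent" sample are all assumptions made for illustration.

```python
import random


def sample_triplet(num_frames, max_gap=100, rng=None):
    """Pick frame indices (template, search, adjacent_search) from one video.

    Hypothetical helper: the abstract only states that an extra search
    sample adjacent to the original one is selected; the gap limit and
    the tie-breaking rule below are illustrative assumptions.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to form an adjacent pair")
    rng = rng or random.Random()

    # Template frame: any frame of the sequence.
    template_idx = rng.randrange(num_frames)

    # Search frame: within max_gap frames of the template (assumed limit).
    lo = max(0, template_idx - max_gap)
    hi = min(num_frames - 1, template_idx + max_gap)
    search_idx = rng.randint(lo, hi)

    # Extra search frame: the temporal neighbour of the search frame.
    # Use the next frame, falling back to the previous one at the end.
    if search_idx + 1 < num_frames:
        adjacent_idx = search_idx + 1
    else:
        adjacent_idx = search_idx - 1

    return template_idx, search_idx, adjacent_idx


if __name__ == "__main__":
    t, s, a = sample_triplet(num_frames=300, rng=random.Random(0))
    print("template:", t, "search:", s, "adjacent search:", a)
```

In a two-modality setting, the same three indices would presumably be used to load both the RGB and the TIR frames, so that the pair of search frames is aligned across modalities; that, too, is an assumption rather than something the abstract states.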




Exploring Fusion Strategies for Accurate RGBT Visual Object Tracking

We address the problem of multi-modal object tracking in video and explo...

Learning Target-oriented Dual Attention for Robust RGB-T Tracking

RGB-Thermal object tracking attempt to locate target object using comple...

Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking

In this study, we propose a novel RGB-T tracking framework by jointly mo...

Unveiling the Power of Deep Tracking

In the field of generic object tracking numerous attempts have been made...

Multi-modal Visual Tracking: Review and Experimental Comparison

Visual object tracking, as a fundamental task in computer vision, has dr...

EANet: Enhanced Attribute-based RGBT Tracker Network

Tracking objects can be a difficult task in computer vision, especially ...

DASTSiam: Spatio-Temporal Fusion and Discriminative Augmentation for Improved Siamese Tracking

Tracking tasks based on deep neural networks have greatly improved with ...
