Towards Real-World Visual Tracking with Temporal Contexts

08/20/2023
by   Ziang Cao, et al.

Visual tracking has improved significantly over the past few decades. Most existing state-of-the-art trackers 1) aim only for performance in ideal conditions while overlooking real-world conditions; 2) adopt the tracking-by-detection paradigm, neglecting rich temporal contexts; 3) integrate temporal information only into the template, leaving temporal contexts among consecutive frames far from fully utilized. To address these problems, we propose a two-level framework (TCTrack) that exploits temporal contexts efficiently. Building on it, we propose a stronger version for real-world visual tracking, TCTrack++. The framework operates at two levels: features and similarity maps. Specifically, for feature extraction, we propose an attention-based temporally adaptive convolution that enhances the spatial features with temporal information by dynamically calibrating the convolution weights. For similarity map refinement, we introduce an adaptive temporal transformer that encodes the temporal knowledge efficiently and decodes it to refine the similarity map accurately. To further improve performance, we additionally introduce a curriculum learning strategy. We also adopt online evaluation to measure performance in real-world conditions. Exhaustive experiments on 8 well-known benchmarks demonstrate the superiority of TCTrack++. Real-world tests directly verify that TCTrack++ can be readily used in real-world applications.
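
The idea of calibrating convolution weights with temporal information can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D signals, the scalar per-frame descriptor, the dot-product attention, the sigmoid-based scaling rule, and all function names below are assumptions chosen only to keep the example small and dependency-free.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_context(descriptors):
    """Attention-pool recent frame descriptors (assumed rule).

    The latest descriptor acts as the query; scores are plain products.
    """
    query = descriptors[-1]
    weights = softmax([query * d for d in descriptors])
    return sum(w * d for w, d in zip(weights, descriptors))


def calibrate(kernel, context):
    """Scale the base kernel by a factor in (0, 2) (assumed rule).

    A context of 0 gives a factor of exactly 1, leaving the kernel unchanged.
    """
    factor = 2.0 / (1.0 + math.exp(-context))
    return [factor * k for k in kernel]


def conv1d(signal, kernel):
    """Valid cross-correlation of a 1-D signal with a kernel."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


def extract_features(frames, base_kernel, memory=3):
    """Filter each frame with a kernel calibrated by the last `memory` frames."""
    descriptors, outputs = [], []
    for frame in frames:
        # Hypothetical frame descriptor: the mean of the frame's values.
        descriptors.append(sum(frame) / len(frame))
        context = temporal_context(descriptors[-memory:])
        outputs.append(conv1d(frame, calibrate(base_kernel, context)))
    return outputs
```

Calling `extract_features(frames, base_kernel)` on a list of equal-length frames returns one filtered signal per frame. Because the kernel is recomputed for every frame from a short history, the same base weights respond differently as the sequence evolves, which is the behavior the abstract attributes to temporally adaptive convolution.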


