Do Different Tracking Tasks Require Different Appearance Models?

07/05/2021
by Zhongdao Wang, et al.
University of Oxford
Tsinghua University
FiveAI Inc

Tracking objects of interest in a video is one of the most popular and widely applicable problems in computer vision. Over the years, however, a Cambrian explosion of use cases and benchmarks has fragmented the problem into a multitude of different experimental setups. As a consequence, the literature has fragmented too, and novel approaches proposed by the community are now usually specialised to fit only one specific setup. To understand to what extent this specialisation is actually necessary, in this work we present UniTrack, a unified tracking solution that addresses five different tasks within the same framework. UniTrack consists of a single, task-agnostic appearance model, which can be learned in a supervised or self-supervised fashion, and multiple "heads" that address individual tasks and do not require training. We show how most tracking tasks can be solved within this framework, and that the same appearance model can be used to obtain performance that is competitive with specialised methods on all five tasks considered. The framework also allows us to analyse appearance models obtained with the most recent self-supervised methods, thus significantly extending their evaluation and comparison to a larger variety of important problems. Code is available at https://github.com/Zhongdao/UniTrack.
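
To make the idea of one shared appearance model feeding several training-free heads more concrete, here is a minimal sketch in plain Python. It is not the UniTrack code: `appearance_model` is a hypothetical stand-in (simple mean-centering) for a learned embedding, and the two head functions, their names, and the `min_sim` threshold are assumptions chosen purely for illustration. The point is only that both heads reuse the same embedding and have no trainable parameters of their own.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Embed = Callable[[Sequence[float]], Vector]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def appearance_model(patch: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a learned, task-agnostic embedding.

    Assumption: a real system would use a trained network here; this toy
    version only subtracts the mean so the example stays self-contained.
    """
    mean = sum(patch) / len(patch)
    return [x - mean for x in patch]


def propagation_head(embed: Embed, template: Sequence[float],
                     candidates: Sequence[Sequence[float]]) -> int:
    """Training-free head: index of the candidate most similar to the template."""
    t = embed(template)
    sims = [cosine(t, embed(c)) for c in candidates]
    return max(range(len(sims)), key=sims.__getitem__)


def association_head(embed: Embed, tracks: Sequence[Sequence[float]],
                     detections: Sequence[Sequence[float]],
                     min_sim: float = 0.5) -> Dict[int, int]:
    """Training-free head: greedily match tracks to detections by similarity."""
    pairs = sorted(
        ((cosine(embed(t), embed(d)), i, j)
         for i, t in enumerate(tracks)
         for j, d in enumerate(detections)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_detections = set()
    for sim, i, j in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if i in matches or j in used_detections:
            continue  # each track and detection is matched at most once
        matches[i] = j
        used_detections.add(j)
    return matches


if __name__ == "__main__":
    # Both heads share the same embedding function.
    best = propagation_head(appearance_model, [1, 2, 3],
                            [[3, 2, 1], [2, 4, 6], [1, 1, 5]])
    links = association_head(appearance_model,
                             [[1, 2, 3], [3, 2, 1]],
                             [[6, 4, 2], [2, 4, 6]])
    print(best, links)
```

Swapping `appearance_model` for a different embedding function changes the behaviour of both heads at once, which is the property that lets a framework of this shape compare appearance models across several tasks.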

03/29/2022

Unified Transformer Tracker for Object Tracking

As an important area in computer vision, object tracking has formed two ...
02/18/2020

MAST: A Memory-Augmented Self-supervised Tracker

Recent interest in self-supervised dense tracking has yielded rapid prog...
06/20/2022

Visualizing and Understanding Self-Supervised Vision Learning

Self-Supervised vision learning has revolutionized deep learning, becomi...
06/22/2020

Self-supervised Video Object Segmentation

The objective of this paper is self-supervised representation learning, ...
12/15/2021

Self-Supervised Monocular Depth and Ego-Motion Estimation in Endoscopy: Appearance Flow to the Rescue

Recently, self-supervised learning technology has been applied to calcul...
04/06/2023

Self-Supervised Video Similarity Learning

We introduce S^2VS, a video similarity learning approach with self-super...
02/21/2022

Self-Supervised Bulk Motion Artifact Removal in Optical Coherence Tomography Angiography

Optical coherence tomography angiography (OCTA) is an important imaging ...
