Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking

by Heng Fan, et al.

Region proposal networks (RPNs) have recently been combined with Siamese networks for tracking and have shown excellent accuracy with high efficiency. Nevertheless, previously proposed one-stage Siamese-RPN trackers degenerate in the presence of similar distractors and large scale variation. To address these issues, we propose a multi-stage tracking framework, Siamese Cascaded RPN (C-RPN), which consists of a sequence of RPNs cascaded from deep high-level to shallow low-level layers in a Siamese network. Compared to previous solutions, C-RPN has several advantages: (1) Each RPN is trained using the outputs of the RPN in the previous stage. This process stimulates hard negative sampling, resulting in more balanced training samples. Consequently, the RPNs become sequentially more discriminative in distinguishing difficult background (i.e., similar distractors). (2) Multi-level features are fully leveraged through a novel feature transfer block (FTB) for each RPN, further improving the discriminability of C-RPN using both high-level semantic and low-level spatial information. (3) With multiple regression steps, C-RPN progressively refines the location and shape of the target in each RPN using the anchor boxes adjusted in the previous stage, which makes localization more accurate. C-RPN is trained end-to-end with a multi-task loss function. At inference, C-RPN is deployed as is, without any temporal adaptation, for real-time tracking. In extensive experiments on OTB-2013, OTB-2015, VOT-2016, VOT-2017, LaSOT and TrackingNet, C-RPN consistently achieves state-of-the-art results and runs in real time.
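To make points (1) and (3) concrete, the following is a minimal sketch of how a cascade of stages might pass anchors forward: each stage discards anchors it confidently scores as background and refines the survivors before the next stage sees them. It is an illustration under stated assumptions, not the authors' implementation. The names `refine`, `run_cascade` and `make_toy_stage`, the `neg_threshold` value, the stage interface, and the toy scoring rule are all hypothetical. The box update uses the standard RPN offset parameterization, which is assumed here rather than taken from the abstract.

```python
import math
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (center x, center y, width, height)
# Hypothetical stage interface: anchors -> (negative confidence, offsets) per anchor.
Stage = Callable[[List[Box]], Tuple[List[float], List[Box]]]


def refine(anchor: Box, offset: Box) -> Box:
    """Apply standard RPN-style offsets (dx, dy, dw, dh) to one anchor."""
    cx, cy, w, h = anchor
    dx, dy, dw, dh = offset
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


def run_cascade(anchors: List[Box], stages: List[Stage],
                neg_threshold: float = 0.95) -> List[Box]:
    """Run the stages in order; each one filters and refines the anchors."""
    for stage in stages:
        neg_conf, offsets = stage(anchors)
        # Assumption: anchors scored as confident background are easy
        # negatives and are dropped; the rest are refined and passed on.
        anchors = [refine(a, o)
                   for a, c, o in zip(anchors, neg_conf, offsets)
                   if c <= neg_threshold]
    return anchors


def make_toy_stage(target: Tuple[float, float]) -> Stage:
    """Toy stand-in for a trained RPN: far anchors are background,
    and every anchor is moved halfway toward the target center."""
    tx, ty = target

    def stage(anchors: List[Box]) -> Tuple[List[float], List[Box]]:
        neg_conf, offsets = [], []
        for cx, cy, w, h in anchors:
            far = abs(cx - tx) + abs(cy - ty) > 50
            neg_conf.append(1.0 if far else 0.0)
            offsets.append(((tx - cx) * 0.5 / w, (ty - cy) * 0.5 / h, 0.0, 0.0))
        return neg_conf, offsets

    return stage


if __name__ == "__main__":
    anchors = [(0.0, 0.0, 10.0, 10.0), (100.0, 100.0, 10.0, 10.0),
               (20.0, 20.0, 10.0, 10.0)]
    stages = [make_toy_stage((10.0, 10.0))] * 2
    print(run_cascade(anchors, stages))
```

In this toy run the distant anchor is removed at the first stage, and the two remaining anchors move closer to the target center with each stage. A real tracker would replace `make_toy_stage` with learned classification and regression heads.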


SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking

By decomposing the visual tracking task into two subproblems as classifi...

SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking

Recently, the Siamese-based method has stood out from multitudinous trac...

SiamCorners: Siamese Corner Networks for Visual Tracking

The current Siamese network based on region proposal network (RPN) has a...

Distractor-aware Siamese Networks for Visual Object Tracking

Recently, Siamese networks have drawn great attention in visual tracking...

Single-Shot Two-Pronged Detector with Rectified IoU Loss

In the CNN based object detectors, feature pyramids are widely exploited...

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

Siamese networks have drawn great attention in visual tracking because o...

Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination

Visual tracking can be easily disturbed by similar surrounding objects. ...