Multi-scale 3D Convolution Network for Video Based Person Re-Identification

11/19/2018
by   Jianing Li, et al.

This paper proposes a two-stream convolution network that extracts spatial and temporal cues for video-based person Re-Identification (ReID). The temporal stream is constructed by inserting several Multi-scale 3D (M3D) convolution layers into a 2D CNN. The resulting M3D convolution network adds only a small fraction of parameters to the 2D CNN, yet gains the ability to learn multi-scale temporal features. Thanks to this compact architecture, the M3D convolution network is also more efficient and easier to optimize than existing 3D convolution networks. The temporal stream further uses Residual Attention Layers (RAL) to refine the temporal features. By jointly learning spatial-temporal attention masks in a residual manner, RAL identifies the discriminative spatial regions and temporal cues. The other stream in our network is a 2D CNN for spatial feature extraction. The spatial and temporal features from the two streams are finally fused for video-based person ReID. Evaluations on three widely used benchmark datasets, i.e., MARS, PRID2011, and iLIDS-VID, demonstrate substantial advantages of our method over existing 3D convolution networks and state-of-the-art methods.
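To make the parameter claim concrete, the following is a rough back-of-the-envelope sketch, not the authors' accounting. Every number and shape in it is an assumption for illustration: a hypothetical backbone of 16 convolution layers with 3 x 3 kernels and 256 channels, temporal branches modeled as 3 x 1 x 1 kernels (one per temporal scale, with the scale set by dilation, which does not change the weight count), 3 temporal scales, and M3D layers inserted at only 4 positions. The abstract does not specify any of these values.

```python
# Hypothetical weight counts (biases ignored). All sizes are assumptions
# chosen for illustration; none of them come from the abstract.

def conv2d_params(c_in, c_out, k=3):
    """Weights in a k x k 2D convolution."""
    return c_in * c_out * k * k

def conv3d_params(c_in, c_out, k=3, t=3):
    """Weights in a t x k x k 3D convolution."""
    return c_in * c_out * t * k * k

def temporal_branch_params(channels, n_scales, t=3):
    """Weights in n_scales parallel t x 1 x 1 temporal kernels.

    Dilation widens the temporal range of a kernel without adding weights,
    so each scale costs the same (assumed structure of an M3D layer).
    """
    return n_scales * channels * channels * t

CHANNELS = 256      # assumed channel width
NUM_LAYERS = 16     # assumed number of 3 x 3 conv layers in the backbone
NUM_M3D_LAYERS = 4  # assumed number of layers that receive temporal branches
NUM_SCALES = 3      # assumed number of temporal scales per M3D layer

# Plain 2D backbone.
total_2d = NUM_LAYERS * conv2d_params(CHANNELS, CHANNELS)

# Same backbone with temporal branches added at a few positions.
total_m3d = total_2d + NUM_M3D_LAYERS * temporal_branch_params(CHANNELS, NUM_SCALES)

# Backbone with every layer inflated to a full 3 x 3 x 3 convolution.
total_3d = NUM_LAYERS * conv3d_params(CHANNELS, CHANNELS)

overhead = (total_m3d - total_2d) / total_2d
print(f"2D backbone:      {total_2d:,} weights")
print(f"with M3D layers:  {total_m3d:,} weights (+{overhead:.0%})")
print(f"fully 3D network: {total_3d:,} weights")
```

Under these assumed sizes, the temporal branches add a quarter of the 2D weight count, while inflating every layer to 3D triples it. The exact ratio depends entirely on how many layers receive temporal branches and how many scales each uses.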

