Grouped Spatial-Temporal Aggregation for Efficient Action Recognition

09/28/2019
by Chenxu Luo, et al.

Temporal reasoning is an important aspect of video analysis. 3D CNNs perform well by learning spatial-temporal features jointly in an unconstrained way, but this substantially increases the computational cost. Previous works try to reduce the complexity by decoupling the spatial and temporal filters. In this paper, we propose a novel decomposition method that splits the feature channels into spatial and temporal groups that are processed in parallel. This decomposition lets the two groups focus on static and dynamic cues, respectively. We call this grouped spatial-temporal aggregation (GST). The decomposition is more parameter-efficient and enables us to quantitatively analyze the contributions of spatial and temporal features in different layers. We verify our model on several action recognition tasks that require temporal reasoning and show its effectiveness.
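
To make the parameter-efficiency claim concrete, the following sketch counts convolution weights for a full 3D convolution and for a channel-grouped layer in the spirit of the description above. Everything specific here is an assumption for illustration, not taken from the abstract: the 3x3x3 and 1x3x3 kernel sizes, the even split of channels between the two groups, the choice to give each group a matching share of the output channels, and the helper names `conv_params` and `grouped_st_params`. Bias terms are ignored.

```python
def conv_params(c_in, c_out, kt, kh, kw):
    """Weight count of a dense convolution with a kt x kh x kw kernel (no bias)."""
    return c_in * c_out * kt * kh * kw


def grouped_st_params(c_in, c_out, temporal_fraction=0.5, k=3):
    """Weight count when channels are split into two parallel groups.

    Assumption (hypothetical layout): the temporal group uses k x k x k
    kernels, the spatial group uses 1 x k x k kernels, and each group maps
    its share of the input channels to the same share of the output channels.
    """
    c_in_t = int(c_in * temporal_fraction)
    c_out_t = int(c_out * temporal_fraction)
    c_in_s = c_in - c_in_t
    c_out_s = c_out - c_out_t
    spatial = conv_params(c_in_s, c_out_s, 1, k, k)   # static cues
    temporal = conv_params(c_in_t, c_out_t, k, k, k)  # dynamic cues
    return spatial + temporal


full_3d = conv_params(64, 64, 3, 3, 3)
grouped = grouped_st_params(64, 64, temporal_fraction=0.5)
print(full_3d, grouped, grouped / full_3d)
```

Under these assumptions, a 64-to-64-channel layer needs 110592 weights as a full 3D convolution and 36864 as a grouped layer, one third as many. The actual savings in the paper depend on its own split ratio and kernel choices, which the abstract does not specify.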


research
07/22/2020

Video-ception Network: Towards Multi-Scale Efficient Asymmetric Spatial-Temporal Interactions

Previous video modeling methods leverage the cubic 3D convolution filter...
research
03/18/2020

STH: Spatio-Temporal Hybrid Convolution for Efficient Action Recognition

Effective and Efficient spatio-temporal modeling is essential for action...
research
05/19/2018

DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding

Many of the leading approaches for video understanding are data-hungry a...
research
08/16/2019

Gradient Weighted Superpixels for Interpretability in CNNs

As Convolutional Neural Networks embed themselves into our everyday live...
research
03/02/2023

AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning

We propose a novel approach for aerial video action recognition. Our met...
research
10/13/2021

Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions

The state-of-the-art deep neural networks are vulnerable to common corru...
research
08/05/2019

Discriminating Spatial and Temporal Relevance in Deep Taylor Decompositions for Explainable Activity Recognition

Current techniques for explainable AI have been applied with some succes...
