Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

08/04/2016
by   Suman Saha, et al.
0

In this work, we propose an approach to the spatiotemporal localisation (detection) and classification of multiple concurrent actions within temporally untrimmed videos. Our framework is composed of three stages. In stage 1, appearance and motion detection networks are employed to localise and score actions from colour images and optical flow. In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap. In stage 3, sequences of detection boxes most likely to be associated with a single action instance, called action tubes, are constructed by solving two energy maximisation problems via dynamic programming. While in the first pass, action paths spanning the whole video are built by linking detection boxes over time using their class-specific scores and their spatial overlap, in the second pass, temporal trimming is performed by ensuring label consistency for all constituting detection boxes. We demonstrate the performance of our algorithm on the challenging UCF101, J-HMDB-21 and LIRIS-HARL datasets, achieving new state-of-the-art results across the board and significantly increasing detection speed at test time. We achieve a huge leap forward in action detection performance and report a 20 precision) on UCF-101 and J-HMDB-21 datasets respectively when compared to the state-of-the-art.

READ FULL TEXT

page 2

page 3

page 8

page 9

research
07/22/2017

Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

Current state-of-the-art human action recognition is focused on the clas...
research
06/26/2017

YoTube: Searching Action Proposal via Recurrent and Static Regression Networks

In this paper, we present YoTube-a novel network fusion framework for se...
research
11/25/2016

Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

We present a deep-learning framework for real-time multiple spatio-tempo...
research
04/03/2020

Two-Stream AMTnet for Action Detection

In this paper, we propose Two-Stream AMTnet, which leverages recent adva...
research
09/06/2022

Spatio-Temporal Action Detection Under Large Motion

Current methods for spatiotemporal action tube detection often extend a ...
research
08/23/2016

Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking

In this paper, we address the problem of searching action proposals in u...
research
03/27/2023

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

The goal of this work is to understand the way actions are performed in ...

Please sign up or login with your details

Forgot password? Click here to reset