DeepAI AI Chat
Log In Sign Up

Incremental Tube Construction for Human Action Detection

by   Harkirat S. Behl, et al.

Current state-of-the-art action detection systems are tailored for offline batch-processing applications. However, for online applications like human-robot interaction, current systems fall short, either because they only detect one action per video, or because they assume that the entire video is available ahead of time. In this work, we introduce a real-time and online joint-labelling and association algorithm for action detection that can incrementally construct space-time action tubes on the most challenging action videos in which different action categories occur concurrently. In contrast to previous methods, we solve the detection-window association and action labelling problems jointly in a single pass. We demonstrate superior online association accuracy and speed (2.2ms per frame) as compared to the current state-of-the-art offline systems. We further demonstrate that the entire action detection pipeline can easily be made to work effectively in real-time using our action tube construction algorithm.


page 1

page 3

page 7

page 8


Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

We present a deep-learning framework for real-time multiple spatio-tempo...

A Circular Window-based Cascade Transformer for Online Action Detection

Online action detection aims at the accurate action prediction of the cu...

Online Spatiotemporal Action Detection and Prediction via Causal Representations

In this thesis, we focus on video action understanding problems from an ...

Two-Stream AMTnet for Action Detection

In this paper, we propose Two-Stream AMTnet, which leverages recent adva...

Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

Current state-of-the-art human action recognition is focused on the clas...

Exploring Temporal Context and Human Movement Dynamics for Online Action Detection in Videos

Nowadays, the interaction between humans and robots is constantly expand...

Linear-time Online Action Detection From 3D Skeletal Data Using Bags of Gesturelets

Sliding window is one direct way to extend a successful recognition syst...