A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning

by Jingjia Huang, et al.

Existing action detection algorithms usually generate action proposals through an exhaustive search over the video at multiple temporal scales, which incurs a large computational overhead and departs from how humans perceive actions. We argue that detecting actions is naturally a process of observation and refinement: observe the current window, then refine the span of the attended window to cover the true action regions. In this paper, we propose an active action proposal model that learns to find actions by continuously adjusting the temporal bounds in a self-adaptive way. The whole process can be viewed as an agent that is first placed at a random position in the video and then applies a sequence of transformations to the currently attended region, following a learned policy, in order to discover actions. We use reinforcement learning, specifically the Deep Q-learning algorithm, to learn the agent's decision policy. In addition, we use a temporal pooling operation to extract a more effective feature representation for long temporal windows, and we design a regression network to correct the position offsets between the predicted results and the ground truth. Experimental results on THUMOS 2014 validate the effectiveness of the proposed approach, which achieves performance competitive with current action detection algorithms while using far fewer proposals.
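
To make the observe-and-refine loop more concrete, here is a minimal Python sketch of one agent step on a temporal window. The abstract does not list the transformations or the reward, so everything here is an assumption for illustration: the action names in `ACTIONS`, the step ratio `alpha`, the `video_len` clamp, and the sign-of-IoU-change reward (a common choice in Q-learning localization work). It is not the authors' implementation.

```python
from typing import Tuple

Window = Tuple[float, float]  # (start, end) of the attended region, in seconds

# Hypothetical action set; the abstract only says "a sequence of transformations".
ACTIONS = ("shift_left", "shift_right", "expand", "shrink")


def temporal_iou(a: Window, b: Window) -> float:
    """Intersection over union of two temporal intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def apply_action(w: Window, action: str, alpha: float = 0.2,
                 video_len: float = float("inf")) -> Window:
    """Transform the window by a step proportional to its length (assumed rule)."""
    start, end = w
    step = alpha * (end - start)
    if action == "shift_left":
        start, end = start - step, end - step
    elif action == "shift_right":
        start, end = start + step, end + step
    elif action == "expand":
        start, end = start - step / 2, end + step / 2
    elif action == "shrink":
        start, end = start + step / 2, end - step / 2
    else:
        raise ValueError(f"unknown action: {action}")
    # Keep the window inside the video.
    return (max(0.0, start), min(video_len, end))


def reward(old: Window, new: Window, gt: Window) -> float:
    """Assumed reward: +1 if the step increased IoU with the ground truth, else -1."""
    delta = temporal_iou(new, gt) - temporal_iou(old, gt)
    return 1.0 if delta > 0 else -1.0
```

In a Deep Q-learning setup, a network would score each entry of `ACTIONS` from features of the current window, and a signal such as `reward` would drive the Q-value updates during training.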


Temporal Fusion Network for Temporal Action Localization: Submission to ActivityNet Challenge 2020 (Task E)

This technical report analyzes a temporal action localization method we ...

A Proposal-Based Solution to Spatio-Temporal Action Detection in Untrimmed Videos

Existing approaches for spatio-temporal action detection in videos are l...

End-to-end Learning of Action Detection from Frame Glimpses in Videos

In this work we introduce a fully end-to-end approach for action detecti...

Tree-Structured Reinforcement Learning for Sequential Object Localization

Existing object proposal algorithms usually search for possible object r...

Proposal Relation Network for Temporal Action Detection

This technical report presents our solution for temporal action detectio...

ABN: Agent-Aware Boundary Networks for Temporal Action Proposal Generation

Temporal action proposal generation (TAPG) aims to estimate temporal int...

Deep Point-wise Prediction for Action Temporal Proposal

Detecting actions in videos is an important yet challenging task. Previo...
