Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization

03/15/2023
by Congqi Cao, et al.

Temporal action localization (TAL) is a widely studied task because of its broad application potential. Existing work in this field suffers from two main weaknesses: (1) it often neglects the multi-label case and focuses only on temporal modeling; (2) it ignores the semantic information in class labels and uses only visual information. To address these problems, we propose a novel Co-Occurrence Relation Module (CORM) that explicitly models the co-occurrence relationship between actions. In addition to visual information, it uses the semantic embeddings of class labels to model this relationship. The CORM works in a plug-and-play manner and can be easily incorporated into existing sequence models. By considering both visual and semantic co-occurrence, our method achieves a high capacity for modeling multi-label relationships. Moreover, existing TAL datasets focus on low-semantic atomic actions. We therefore construct a challenging multi-label dataset, UCF-Crime-TAL, that focuses on high-semantic actions by annotating the UCF-Crime dataset at the frame level and accounting for the semantic overlap between different events. Extensive experiments on two commonly used TAL datasets, MultiTHUMOS and TSU, and on our newly proposed UCF-Crime-TAL demonstrate the effectiveness of the proposed CORM, which achieves state-of-the-art performance on these datasets.
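To make the notion of action co-occurrence concrete, the following is a minimal illustrative sketch, not the CORM itself: the abstract does not specify the module's architecture. It assumes frame-level multi-label annotations given as sets of class indices and estimates how often class j is active in frames where class i is active. The function name `cooccurrence_matrix` and the toy data are hypothetical.

```python
def cooccurrence_matrix(frame_labels, num_classes):
    """Estimate P(class j active | class i active) from frame-level labels.

    frame_labels: iterable of sets of class indices, one set per frame
    (an assumed annotation format, not taken from the paper).
    """
    # counts[i][j] = number of frames in which classes i and j are both active
    counts = [[0] * num_classes for _ in range(num_classes)]
    for active in frame_labels:
        for i in active:
            for j in active:
                counts[i][j] += 1

    probs = [[0.0] * num_classes for _ in range(num_classes)]
    for i in range(num_classes):
        total = counts[i][i]  # frames in which class i is active
        if total == 0:
            continue  # class never appears; leave its row at zero
        for j in range(num_classes):
            probs[i][j] = counts[i][j] / total
    return probs


# Toy example: classes 0 and 1 overlap in two frames, class 2 occurs alone.
frames = [{0, 1}, {0, 1}, {0}, {2}, set()]
probs = cooccurrence_matrix(frames, num_classes=3)
```

Note that the resulting matrix is asymmetric: in the toy data class 1 never appears without class 0, while class 0 does appear without class 1. Statistics of this kind describe only label co-occurrence; the abstract states that the CORM additionally draws on visual features and semantic label embeddings.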


