SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

03/31/2020
by Mohsen Fayyaz, et al.

Temporal action segmentation is a topic of increasing interest; however, annotating each frame in a video is cumbersome and costly. Weakly supervised approaches therefore aim to learn temporal action segmentation from videos that are only weakly labeled. In this work, we assume that for each training video only the list of actions that occur in the video is given, but not when, how often, or in which order they occur. To address this task, we propose an approach that can be trained end-to-end on such data. The approach divides the video into smaller temporal regions and predicts the action label and length of each region. In addition, the network estimates the action label of each frame. By measuring how consistent the frame-wise predictions are with the temporal regions and with the annotated action labels, the network learns to divide a video into class-consistent regions. We evaluate our approach on three datasets, where it achieves state-of-the-art results.
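The consistency idea in the abstract can be illustrated with a small, non-differentiable sketch. This is not the network or loss from the paper; it only shows, under simplifying assumptions, what it means for frame-wise predictions to agree with predicted regions and with the annotated action set. The function names (`expand_regions`, `consistency`, `labels_outside_set`), the action names, and the use of relative region lengths are all hypothetical choices made for illustration.

```python
def expand_regions(region_labels, region_lengths, num_frames):
    """Turn per-region (label, relative length) pairs into one label per frame.

    Assumption: lengths are relative weights; the last region absorbs any
    rounding remainder so that exactly num_frames labels are produced.
    """
    total = sum(region_lengths)
    frame_labels = []
    last = len(region_labels) - 1
    for i, (label, length) in enumerate(zip(region_labels, region_lengths)):
        remaining = num_frames - len(frame_labels)
        if i == last:
            count = remaining
        else:
            count = min(round(num_frames * length / total), remaining)
        frame_labels.extend([label] * count)
    return frame_labels


def consistency(frame_predictions, region_frame_labels):
    """Fraction of frames where the frame-wise label matches the region label."""
    agree = sum(p == r for p, r in zip(frame_predictions, region_frame_labels))
    return agree / len(frame_predictions)


def labels_outside_set(frame_labels, annotated_actions):
    """Predicted labels that are not in the video's annotated action set."""
    return set(frame_labels) - set(annotated_actions)


# Toy video with 8 frames and the (unordered) annotated set {"pour", "stir"}.
annotated = {"pour", "stir"}
regions = expand_regions(["pour", "stir"], [0.25, 0.75], num_frames=8)
frames = ["pour", "pour", "pour", "stir", "stir", "stir", "stir", "stir"]

score = consistency(frames, regions)
extra = labels_outside_set(frames, annotated)
```

Here the region-level and frame-level labelings disagree on one of eight frames, and neither uses a label outside the annotated set. A trainable model would need a differentiable counterpart of these hard comparisons.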


