Transformers in Action: Weakly Supervised Action Segmentation

01/14/2022
by John Ridley, et al.

The video action segmentation task is regularly explored under weaker forms of supervision, such as transcript supervision, where an ordered list of actions is easier to obtain than dense frame-wise labels. In this formulation, the task poses several challenges for sequence modeling approaches: an emphasis on action transition points, long sequence lengths, and the need for frame contextualization. These properties make the task well suited to transformers. Building on developments that enable transformers to scale linearly, we demonstrate through our architecture how they can be applied to improve action alignment accuracy over equivalent RNN-based models, with the attention mechanism focusing on salient action transition regions. Additionally, given the recent focus on inference-time transcript selection, we propose a supplemental transcript embedding approach to select transcripts more quickly at inference time, and we demonstrate that this approach can also improve overall segmentation performance. Finally, we evaluate our proposed methods across the benchmark datasets to better understand the applicability of transformers and the importance of transcript selection on this video-driven, weakly supervised task.
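
To make the transcript-supervised setting concrete, the sketch below shows what "action alignment" means: given frame-wise class scores and an ordered transcript, assign every frame to one transcript entry so that the actions appear in transcript order. This is a generic Viterbi-style dynamic program, not the architecture proposed in the paper; the function name `align_transcript` and the toy probabilities are assumptions made for illustration only.

```python
from typing import List, Sequence

import numpy as np


def align_transcript(log_probs: np.ndarray, transcript: Sequence[int]) -> List[int]:
    """Label every frame so that the actions follow the transcript order.

    log_probs:  (T, C) frame-wise log-probabilities over C action classes.
    transcript: ordered action class indices, with 1 <= len(transcript) <= T.
    """
    T, N = log_probs.shape[0], len(transcript)
    if N == 0 or N > T:
        raise ValueError("transcript must be non-empty and no longer than the video")

    # score[t, n]: best total log-probability with frame t assigned to step n.
    score = np.full((T, N), -np.inf)
    # moved[t, n]: True if frame t is the first frame of step n.
    moved = np.zeros((T, N), dtype=bool)
    score[0, 0] = log_probs[0, transcript[0]]

    for t in range(1, T):
        for n in range(min(t + 1, N)):
            stay = score[t - 1, n]
            advance = score[t - 1, n - 1] if n > 0 else -np.inf
            moved[t, n] = advance > stay
            score[t, n] = max(stay, advance) + log_probs[t, transcript[n]]

    # Backtrack from the last frame, which must belong to the last step.
    n = N - 1
    labels = [0] * T
    for t in range(T - 1, -1, -1):
        labels[t] = transcript[n]
        if moved[t, n]:
            n -= 1
    return labels


# Toy example: 6 frames, 3 classes, frame scores favouring 2, 2, 0, 0, 0, 1.
probs = np.full((6, 3), 0.1)
for frame, cls in enumerate([2, 2, 0, 0, 0, 1]):
    probs[frame, cls] = 0.8
print(align_transcript(np.log(probs), [2, 0, 1]))  # [2, 2, 0, 0, 0, 1]
```

The alignment always respects the transcript order, even when individual frame scores disagree with it. This is why both the quality of the frame-wise scores and the choice of transcript at inference time matter for the final segmentation.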

research
03/31/2020

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

Temporal action segmentation is a topic of increasing interest, however,...
research
03/23/2017

Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling

We present an approach for weakly supervised learning of human actions. ...
research
03/29/2020

Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection

We address weakly-supervised video actor-action segmentation (VAAS), whi...
research
06/03/2019

A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation

Action recognition has become a rapidly developing research field within...
research
05/07/2020

Learning to Segment Actions from Observation and Narration

We apply a generative segmental model of task structure, guided by narra...
research
03/28/2018

Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

In this work, we address the task of weakly-supervised human action segm...
research
08/09/2021

FIFA: Fast Inference Approximation for Action Segmentation

We introduce FIFA, a fast approximate inference method for action segmen...
