Weakly supervised learning of actions from transcripts

10/07/2016
by   Hilde Kuehne, et al.

We present an approach for weakly supervised learning of human actions from video transcriptions. Our system is based on the idea that, given a sequence of input data and a transcript, i.e. an ordered list of the actions that occur in the video, it is possible to infer the actions within the video stream and thus learn the related action models without any frame-based annotation. Starting from the transcript, we split each data sequence uniformly according to the number of expected actions. We then learn an action model for each class by maximizing the probability that the training video sequences are generated by the action models, given the action order defined by the transcripts. The learned models can be used to temporally segment an unseen video with or without a transcript. We evaluate our approach on four distinct activity datasets: Hollywood Extended, MPII Cooking, Breakfast, and CRIM13. We show that our system is able to align the scripted actions with the video data, that the learned models localize and classify actions competitively with models trained under full supervision, i.e. with frame-level annotations, and that they outperform current state-of-the-art approaches for aligning transcripts with video data.
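
The two steps named in the abstract, a uniform initial split followed by a realignment that maximizes the sequence probability under the transcript order, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the action models or the decoding procedure, so the sketch assumes that per-frame log-likelihoods are already available and uses a simple Viterbi-style dynamic program. The names `uniform_segmentation`, `forced_alignment`, and `frame_scores` are hypothetical.

```python
from typing import Dict, List, Sequence


def uniform_segmentation(num_frames: int, transcript: Sequence[str]) -> List[str]:
    """Initial guess: split the video into equally long parts, one per transcript action."""
    n = len(transcript)
    if n == 0 or num_frames < n:
        raise ValueError("need at least one frame per transcript action")
    return [transcript[t * n // num_frames] for t in range(num_frames)]


def forced_alignment(
    frame_scores: Sequence[Dict[str, float]], transcript: Sequence[str]
) -> List[str]:
    """Best monotonic frame labeling that follows the transcript order.

    frame_scores[t][a] is an assumed log-likelihood of frame t under the
    model of action a. Each transcript action must cover at least one frame.
    """
    num_frames, n = len(frame_scores), len(transcript)
    if n == 0 or num_frames < n:
        raise ValueError("need at least one frame per transcript action")

    neg_inf = float("-inf")
    # best[t][k]: best score of frames 0..t with frame t in the k-th action.
    best = [[neg_inf] * n for _ in range(num_frames)]
    # advanced[t][k]: True if frame t is the first frame of the k-th action.
    advanced = [[False] * n for _ in range(num_frames)]
    best[0][0] = frame_scores[0][transcript[0]]

    for t in range(1, num_frames):
        for k in range(min(t + 1, n)):
            stay = best[t - 1][k]
            move = best[t - 1][k - 1] if k > 0 else neg_inf
            advanced[t][k] = move > stay
            best[t][k] = max(stay, move) + frame_scores[t][transcript[k]]

    # Backtrack from the last frame, which must belong to the last action.
    labels = [""] * num_frames
    k = n - 1
    for t in range(num_frames - 1, -1, -1):
        labels[t] = transcript[k]
        if advanced[t][k]:
            k -= 1
    return labels


if __name__ == "__main__":
    transcript = ["pour", "stir"]
    # Made-up scores for a four-frame toy video (illustration only).
    frame_scores = [
        {"pour": -0.1, "stir": -3.0},
        {"pour": -0.2, "stir": -2.0},
        {"pour": -0.3, "stir": -2.5},
        {"pour": -4.0, "stir": -0.1},
    ]
    print(uniform_segmentation(len(frame_scores), transcript))
    print(forced_alignment(frame_scores, transcript))
```

In a full training loop, the realigned labels would be used to re-estimate the action models and the alignment repeated. The toy scores above are invented solely to show the boundary moving away from the uniform split.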


research
06/02/2017

Temporal Action Labeling using Action Sets

Action detection and temporal segmentation of actions in videos are topi...
research
03/23/2017

Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling

We present an approach for weakly supervised learning of human actions. ...
research
12/13/2019

Action Modifiers: Learning from Adverbs in Instructional Videos

We present a method to learn a representation for adverbs from instructi...
research
07/04/2014

Weakly Supervised Action Labeling in Videos Under Ordering Constraints

We are given a set of video clips, each one annotated with an ordered l...
research
07/28/2016

Connectionist Temporal Modeling for Weakly Supervised Action Labeling

We propose a weakly-supervised framework for action labeling in video, w...
research
10/22/2019

Weakly-Supervised Completion Moment Detection using Temporal Attention

Monitoring the progression of an action towards completion offers fine g...
research
08/25/2022

Enabling Weakly-Supervised Temporal Action Localization from On-Device Learning of the Video Stream

Detecting actions in videos have been widely applied in on-device applic...
