A Perceptual Prediction Framework for Self Supervised Event Segmentation

11/12/2018
by   Sathyanarayanan N. Aakur, et al.

Temporal segmentation of long videos is an important problem that has largely been tackled through supervised learning, often requiring large amounts of annotated training data. In this paper, we tackle the problem of self-supervised temporal segmentation of long videos, which alleviates the need for any supervision. We introduce a self-supervised, predictive learning framework that draws inspiration from cognitive psychology to segment long, visually complex videos into individual, stable segments that share the same semantics. We also introduce a new adaptive learning paradigm that helps reduce the effect of catastrophic forgetting in recurrent neural networks. Extensive experiments on three publicly available datasets (Breakfast Actions, 50 Salads, and INRIA Instructional Videos) show the efficacy of the proposed approach. We show that the proposed approach outperforms weakly-supervised and other unsupervised learning approaches by up to 24% and has competitive performance compared to fully supervised approaches. We also show that the proposed approach learns highly discriminative features that help improve action recognition when used in a representation learning paradigm.
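As a rough illustration of predictive segmentation in general, and not the authors' implementation, the sketch below marks a segment boundary wherever the error of a next-frame feature prediction spikes. The abstract does not state the boundary criterion, so the spike rule is an assumption, loosely following the cognitive-psychology idea that events end where prediction fails. The function names `prediction_errors` and `segment_boundaries`, the `ratio` threshold, and the naive "next frame equals current frame" predictor are all hypothetical stand-ins; the paper's framework uses a learned recurrent predictor instead.

```python
from typing import List, Sequence


def prediction_errors(features: Sequence[Sequence[float]]) -> List[float]:
    """Squared error between each frame's features and a naive prediction.

    ASSUMPTION: the predictor is a placeholder that predicts the next
    frame's features to equal the current frame's. A learned recurrent
    predictor would take its place in a real system.
    """
    errors = []
    for prev, curr in zip(features, features[1:]):
        errors.append(sum((c - p) ** 2 for p, c in zip(prev, curr)))
    return errors


def segment_boundaries(errors: Sequence[float], ratio: float = 3.0) -> List[int]:
    """Return frame indices where the error exceeds ratio * mean error.

    ASSUMPTION: a fixed multiple of the mean error is a hypothetical
    threshold chosen for illustration only.
    """
    if not errors:
        return []
    mean_err = sum(errors) / len(errors)
    # errors[i] compares frame i with frame i + 1,
    # so a spike at i means a new segment starts at frame i + 1.
    return [i + 1 for i, e in enumerate(errors) if e > 0 and e > ratio * mean_err]


if __name__ == "__main__":
    # Toy per-frame features: three frames near (0, 0), then three near (5, 5).
    feats = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
             [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
    print(segment_boundaries(prediction_errors(feats)))
```

On the toy input, the only large error is between the third and fourth frames, so the sketch reports a single boundary at frame index 3. A global mean threshold is the simplest possible rule; it needs the whole error sequence up front, which a streaming setting would not have.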


