Semi-Supervised Action Recognition with Temporal Contrastive Learning

02/04/2021
by Ankit Singh, et al.

Learning to recognize actions from only a handful of labeled videos is challenging because activity labels are scarce and tedious to collect. We approach this problem by learning a two-pathway temporal contrastive model from unlabeled videos played at two different speeds, leveraging the fact that changing a video's speed does not change the action it shows. Specifically, we propose to maximize the similarity between encoded representations of the same video at two different speeds, and to minimize the similarity between different videos played at different speeds. In this way, we exploit the rich supervisory signal of 'time' that is present in an otherwise unsupervised pool of videos. With this simple yet effective strategy of manipulating video playback rates, we considerably outperform video extensions of sophisticated state-of-the-art semi-supervised image recognition methods across multiple diverse benchmark datasets and network architectures. Interestingly, our approach also benefits from out-of-domain unlabeled videos, which indicates generalization and robustness. We also perform rigorous ablations and analysis to validate our approach.
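The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an InfoNCE-style loss over cosine similarities, and the names `subsample_frames`, `temporal_contrastive_loss`, and the `temperature` value are hypothetical choices made for illustration. Row i of `fast` and row i of `slow` are assumed to be encoder outputs for the same video at two playback rates; every other row serves as a negative.

```python
import math


def subsample_frames(frames, stride):
    """Simulate a faster playback rate by keeping every `stride`-th frame."""
    return frames[::stride]


def _normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def temporal_contrastive_loss(fast, slow, temperature=0.5):
    """InfoNCE-style loss (an assumption, not necessarily the paper's exact form).

    fast[i] and slow[i] are embeddings of the same video at two speeds
    (the positive pair); slow[k] for k != i are treated as negatives.
    """
    fast = [_normalize(v) for v in fast]
    slow = [_normalize(v) for v in slow]
    total = 0.0
    for i, f in enumerate(fast):
        # Similarity of the fast view of video i to the slow view of every video.
        logits = [sum(a * b for a, b in zip(f, s)) / temperature for s in slow]
        # Numerically stable log-sum-exp over the row.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the matching video.
        total += log_denominator - logits[i]
    return total / len(fast)
```

Minimizing this quantity pulls the two speed views of each video together and pushes views of different videos apart. In practice the embeddings would come from a video encoder applied to clips such as `subsample_frames(frames, 1)` and `subsample_frames(frames, 2)`; the encoder is left out here.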


research
10/28/2021

Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Unsupervised domain adaptation which aims to adapt models trained on a l...
research
03/02/2023

Ego-Vehicle Action Recognition based on Semi-Supervised Contrastive Learning

In recent years, many automobiles have been equipped with cameras, which...
research
10/30/2018

Random Temporal Skipping for Multirate Video Analysis

Current state-of-the-art approaches to video understanding adopt tempora...
research
11/25/2021

Learning from Temporal Gradient for Semi-supervised Action Recognition

Semi-supervised video action recognition tends to enable deep neural net...
research
01/24/2018

Unsupervised learning from videos using temporal coherency deep networks

In this work we address the challenging problem of unsupervised learning...
research
03/21/2018

T-RECS: Training for Rate-Invariant Embeddings by Controlling Speed for Action Recognition

An action should remain identifiable when modifying its speed: consider ...
research
06/15/2015

Slow and steady feature analysis: higher order temporal coherence in video

How can unlabeled video augment visual learning? Existing methods perfor...
