
Feature sampling and partitioning for visual vocabulary generation on large action classification datasets

05/29/2014
by Michael Sapienza, et al.

The recent trend in action recognition is towards larger datasets, an increasing number of action classes and larger visual vocabularies. State-of-the-art human action classification in challenging video data is currently based on a bag-of-visual-words pipeline in which space-time features are aggregated globally to form a histogram. The strategies chosen to sample features and construct a visual vocabulary are critical to performance and often dominate it. In this work we provide a critical evaluation of various approaches to building a vocabulary and show that good practices do have a significant impact. By subsampling and partitioning features strategically, we are able to achieve state-of-the-art results on 5 major action recognition datasets using relatively small visual vocabularies.
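To make the pipeline in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "partitioning" means splitting features by action class, that "subsampling" means drawing a fixed number of features per class, and that k-means builds the vocabulary; the abstract does not confirm any of these. All function names, parameters and the 2-D toy features are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centres):
    """Index of the centre closest to `point`."""
    return min(range(len(centres)), key=lambda k: sq_dist(point, centres[k]))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering method)."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centres)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centre if a cluster goes empty
                centres[j] = [sum(col) / len(g) for col in zip(*g)]
    return centres

def build_vocabulary(features_by_class, per_class_sample, words_per_class, seed=0):
    """Subsample each class partition, cluster it, and concatenate the words."""
    rng = random.Random(seed)
    vocab = []
    for label in sorted(features_by_class):
        feats = features_by_class[label]
        sample = rng.sample(feats, min(per_class_sample, len(feats)))
        vocab.extend(kmeans(sample, words_per_class, seed=seed))
    return vocab

def bow_histogram(video_features, vocab):
    """Aggregate all features of one video into a normalised word histogram."""
    hist = [0.0] * len(vocab)
    for f in video_features:
        hist[nearest(f, vocab)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy data: 2-D stand-ins for space-time descriptors of two hypothetical classes.
rng = random.Random(1)
toy = {
    "run": [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)],
    "jump": [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(200)],
}
vocab = build_vocabulary(toy, per_class_sample=50, words_per_class=4)
hist = bow_histogram(toy["run"][:30], vocab)
```

Here the vocabulary size is the number of classes times `words_per_class`, so it stays small even though only 50 of the 200 features per class are clustered. In a real system the histogram would then be passed to a classifier, which this sketch leaves out.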

