Feature and Region Selection for Visual Learning

07/20/2014
by Ji Zhao, et al.

Visual learning problems such as object classification and action recognition are typically approached using extensions of the popular bag-of-words (BoW) model. Despite its great success, it is unclear what visual features the BoW model is learning: Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer them, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and to jointly optimize these latent variables with the parameters of a classifier (e.g., a support vector machine). There are four main benefits of our approach: (1) it accommodates non-linear additive kernels such as the popular χ^2 and intersection kernels; (2) it handles both regions in images and spatio-temporal regions in videos in a unified way; (3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; (4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results on the PASCAL VOC 2007, MSR Action Dataset II, and YouTube datasets illustrate the benefits of our approach.
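To make the idea of latent feature weights combined with an additive kernel more concrete, here is a minimal sketch in Python. It is not the paper's formulation, which the abstract does not spell out: it assumes that each visual word d receives a non-negative weight w[d] that scales that word's contribution to an additive χ^2 kernel. The function names (`chi2_per_dim`, `weighted_additive_kernel`) and the toy data are hypothetical, and the joint optimization of the weights with the classifier is not shown.

```python
import numpy as np

def chi2_per_dim(x, y, eps=1e-12):
    """Per-dimension additive chi-squared kernel terms: 2*x_d*y_d / (x_d + y_d)."""
    return 2.0 * x * y / (x + y + eps)

def weighted_additive_kernel(X, Y, w):
    """K[i, j] = sum_d w[d] * k(X[i, d], Y[j, d]).

    Assumption (not taken from the paper): one non-negative weight per
    visual word, applied to that word's term of the additive kernel.
    """
    # Broadcast to shape (n, m, D), then take the weighted sum over D.
    terms = chi2_per_dim(X[:, None, :], Y[None, :, :])
    return terms @ w

# Toy L1-normalized BoW histograms: 4 samples, 6 visual words.
rng = np.random.default_rng(0)
X = rng.random((4, 6))
X /= X.sum(axis=1, keepdims=True)

w_uniform = np.ones(6)                                  # ordinary chi-squared kernel
w_select = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])     # keep only the first two words

K_full = weighted_additive_kernel(X, X, w_uniform)
K_sel = weighted_additive_kernel(X, X, w_select)
```

Because the kernel is a sum over visual words, setting a weight to zero removes that word from the similarity entirely, and the weighted kernel is a non-negative combination of per-word kernels. Under the assumptions of this sketch, that is one way to see the connection to multiple kernel learning mentioned in the abstract.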


research
03/23/2017

A Bag-of-Words Equivalent Recurrent Neural Network for Action Recognition

The traditional bag-of-words approach has found a wide range of applicat...
research
09/30/2018

Improving Bag-of-Visual-Words Towards Effective Facial Expressive Image Classification

Bag-of-Visual-Words (BoVW) approach has been widely used in the recent y...
research
07/30/2015

Action recognition in still images by latent superpixel classification

Action recognition from still images is an important task of computer vi...
research
08/02/2016

Spatio-temporal Co-Occurrence Characterizations for Human Action Classification

The human action classification task is a widely researched topic and is...
research
09/09/2016

Image and Video Mining through Online Learning

Within the field of image and video recognition, the traditional approac...
research
05/20/2020

Discriminative Dictionary Design for Action Classification in Still Images and Videos

In this paper, we address the problem of action recognition from still i...
