Visual Word Selection without Re-Coding and Re-Pooling

07/23/2014
by Fatih Cakir, et al.

The Bag-of-Words (BoW) representation is widely used in computer vision. The size of the codebook affects the time and space complexity of applications that use BoW. Thus, given a training set for a particular computer vision task, a key problem is pruning a large codebook to select only a subset of visual words. Evaluating candidate selections of words for the pruned codebook can be computationally prohibitive: in a brute-force scheme, evaluating each pruned codebook requires re-coding all features extracted from the training images to words in the candidate codebook and then re-pooling the words to obtain a representation of each image, e.g., a histogram of visual word frequencies. In this paper, a method is proposed that selects and evaluates a subset of words from an initially large codebook without the need for re-coding or re-pooling. Formulations are proposed for two commonly used schemes: hard and soft (kernel) coding of visual words with average-pooling. The effectiveness of these formulations is evaluated on the 15 Scenes and Caltech 10 benchmarks.
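
To make the cost of the brute-force scheme concrete, the following is a minimal sketch of the re-coding and re-pooling step that the abstract describes as the baseline. It is not the paper's proposed method. The function names (`hard_code_and_pool`, `soft_code_and_pool`), the `keep` index list, and the Gaussian kernel with bandwidth `sigma` used for soft coding are all assumptions chosen for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_code_and_pool(features, codebook, keep):
    """Brute-force baseline: re-code every feature against the kept
    words (nearest word wins), then average-pool into a histogram.

    `keep` is a hypothetical list of codebook indices that survive pruning.
    """
    hist = [0.0] * len(keep)
    for f in features:
        nearest = min(range(len(keep)),
                      key=lambda j: sq_dist(f, codebook[keep[j]]))
        hist[nearest] += 1.0
    return [h / len(features) for h in hist]


def soft_code_and_pool(features, codebook, keep, sigma=1.0):
    """Soft (kernel) coding with average pooling.

    Assumption: a Gaussian kernel, normalized per feature over the kept
    words; the paper's exact kernel is not given in the abstract.
    """
    hist = [0.0] * len(keep)
    for f in features:
        weights = [math.exp(-sq_dist(f, codebook[k]) / (2 * sigma ** 2))
                   for k in keep]
        total = sum(weights)
        for j, w in enumerate(weights):
            hist[j] += w / total
    return [h / len(features) for h in hist]


# Toy data: five 2-D features and a three-word codebook.
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1), (2.0, 2.0)]
codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

full_hist = hard_code_and_pool(features, codebook, keep=[0, 1, 2])
# Dropping word 2 forces a full pass over all features again.
pruned_hist = hard_code_and_pool(features, codebook, keep=[0, 1])
soft_hist = soft_code_and_pool(features, codebook, keep=[0, 1])
```

Every candidate `keep` set triggers another pass over all features of all training images, which is the cost the abstract says the proposed formulations avoid.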

