REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval

06/15/2019
by Syed Sameed Husain, et al.

This paper addresses the problem of very large-scale image retrieval, focusing on improving its accuracy and robustness. We target enhanced robustness of search to factors such as variations in illumination, object appearance and scale, partial occlusions, and cluttered backgrounds, which is particularly important when search is performed across very large datasets with significant variability. We propose a novel CNN-based global descriptor, called REMAP, which learns and aggregates a hierarchy of deep features from multiple CNN layers, and is trained end-to-end with a triplet loss. REMAP explicitly learns discriminative features which are mutually supportive and complementary at various semantic levels of visual abstraction. These dense local features are max-pooled spatially at each layer, within multi-scale overlapping regions, before aggregation into a single image-level descriptor. To identify the semantically useful regions and layers for retrieval, we propose to measure the information gain of each region and layer using KL-divergence. During training, our system learns how useful the various regions and layers are and weights them accordingly. We show that such relative entropy-guided aggregation outperforms classical CNN-based aggregation controlled by SGD. The entire framework is trained in an end-to-end fashion, outperforming the latest state-of-the-art results. On the image retrieval datasets Holidays, Oxford and MPEG, the REMAP descriptor achieves mAP of 95.5 respectively, outperforming any results published to date. REMAP also formed the core of the winning submission to the Google Landmark Retrieval Challenge on Kaggle.
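The pooling and weighting steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state how regions are laid out or how information gain is estimated, so the region grid (roughly 50% overlap), the histogram-based KL-divergence between distances of matching and non-matching pairs, and all function names (`region_max_pool`, `kl_divergence`, `region_weights`, `aggregate`) are assumptions made for illustration. In the paper the weighting is learned end-to-end across several CNN layers; here a single feature map and fixed weights are used.

```python
import numpy as np


def region_max_pool(fmap, scales=(1, 2, 3)):
    """Max-pool a (C, H, W) feature map over overlapping regions at several scales.

    Returns an (R, C) array, one descriptor per region.
    The region layout is an assumption, not taken from the paper.
    """
    _, h, w = fmap.shape
    pooled = []
    for s in scales:
        # Region size chosen so that s regions per axis overlap by about 50 %.
        rh = max(1, int(np.ceil(2 * h / (s + 1))))
        rw = max(1, int(np.ceil(2 * w / (s + 1))))
        ys = np.linspace(0, h - rh, s).round().astype(int)
        xs = np.linspace(0, w - rw, s).round().astype(int)
        for y in ys:
            for x in xs:
                pooled.append(fmap[:, y:y + rh, x:x + rw].max(axis=(1, 2)))
    return np.stack(pooled)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two non-negative histograms (normalised internally)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


def region_weights(match_dists, nonmatch_dists, bins=20):
    """Weight each region by the KL-divergence between its distance histograms.

    match_dists, nonmatch_dists: (R, N) arrays of descriptor distances for
    matching and non-matching image pairs (hypothetical inputs).
    """
    gains = []
    for m, n in zip(match_dists, nonmatch_dists):
        lo, hi = min(m.min(), n.min()), max(m.max(), n.max())
        p, _ = np.histogram(m, bins=bins, range=(lo, hi))
        q, _ = np.histogram(n, bins=bins, range=(lo, hi))
        gains.append(kl_divergence(p, q))
    gains = np.array(gains)
    return gains / (gains.sum() + 1e-12)


def aggregate(region_desc, weights):
    """L2-normalise region descriptors, take the weighted sum, L2-normalise again."""
    norms = np.linalg.norm(region_desc, axis=1, keepdims=True) + 1e-12
    g = (weights[:, None] * (region_desc / norms)).sum(axis=0)
    return g / (np.linalg.norm(g) + 1e-12)
```

Regions whose matching and non-matching distance distributions are well separated receive a larger weight, which is one plausible reading of "information gain" in the abstract.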


research
07/12/2019

ACTNET: end-to-end learning of feature activations and multi-stream aggregation for effective instance image retrieval

We propose a novel CNN architecture called ACTNET for robust instance im...
research
03/03/2019

MILDNet: A Lightweight Single Scaled Deep Ranking Architecture

Multi-scale deep CNN architecture [1, 2, 3] successfully captures both f...
research
07/12/2019

ACTNET: end-to-end learning of feature activations and aggregation for effective instance image retrieval

We propose a novel CNN architecture called ACTNET for robust instance im...
research
04/21/2020

Image Retrieval using Multi-scale CNN Features Pooling

In this paper, we address the problem of image retrieval by learning ima...
research
01/11/2021

Investigating the Vision Transformer Model for Image Retrieval Tasks

This paper introduces a plug-and-play descriptor that can be effectively...
research
06/23/2018

Leveraging Implicit Spatial Information in Global Features for Image Retrieval

Most image retrieval methods use global features that aggregate local di...
research
04/13/2015

Multiple Measurements and Joint Dimensionality Reduction for Large Scale Image Search with Short Vectors - Extended Version

This paper addresses the construction of a short-vector (128D) image rep...
