Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps

by Xiaolin Zhang, et al.

Recently, remarkable progress has been made in weakly supervised object localization (WSOL) toward better object localization maps. The common way of evaluating these maps is indirect and coarse: tight bounding boxes are fitted around the high-activation regions, and intersection-over-union (IoU) scores are computed between the predicted and ground-truth boxes. This measurement reflects the quality of localization maps to some extent, but we argue that the maps should be measured directly and at a finer granularity, i.e., by comparing them pixel by pixel with the ground-truth object masks. To enable this direct evaluation, we annotate pixel-level object masks on the ILSVRC validation set and propose IoU-Threshold curves for evaluating the real quality of localization maps. Beyond the amended evaluation metric and the annotated object masks, this work also introduces a novel self-enhancement method that harvests accurate object localization maps and object boundaries with only category labels as supervision. We propose a two-stage approach that generates the localization maps by simply comparing the similarity of point-wise features between the high-activation pixels and the remaining pixels. Based on the predicted localization maps, we explore estimating object boundaries on a very large dataset. A hard-negative suppression loss is proposed for obtaining fine boundaries. We conduct extensive experiments on the ILSVRC and CUB benchmarks. In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88 released at
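The direct, pixel-wise evaluation described in the abstract can be illustrated with a minimal sketch of an IoU-Threshold curve: a localization map is binarized at a series of thresholds, and the pixel-wise IoU against the ground-truth mask is recorded at each one. The function name `iou_threshold_curve`, the min-max normalization, and the toy data are assumptions made for illustration; the abstract does not specify the exact protocol.

```python
import numpy as np


def iou_threshold_curve(loc_map, gt_mask, thresholds):
    """Pixel-wise IoU between a binarized localization map and a mask.

    Hypothetical helper: returns one IoU value per threshold.
    """
    loc_map = np.asarray(loc_map, dtype=float)
    gt_mask = np.asarray(gt_mask, dtype=bool)

    # Assumption: min-max normalize the map so thresholds live in [0, 1].
    span = loc_map.max() - loc_map.min()
    if span > 0:
        norm = (loc_map - loc_map.min()) / span
    else:
        norm = np.zeros_like(loc_map)

    ious = []
    for t in thresholds:
        pred = norm >= t  # foreground pixels at this threshold
        inter = np.logical_and(pred, gt_mask).sum()
        union = np.logical_or(pred, gt_mask).sum()
        ious.append(inter / union if union > 0 else 0.0)
    return np.array(ious)


# Toy 3x3 example (made-up values, not data from the paper).
loc_map = np.array([[0.0, 0.2, 0.1],
                    [0.3, 0.9, 0.8],
                    [0.1, 0.7, 1.0]])
gt_mask = np.array([[0, 0, 0],
                    [0, 1, 1],
                    [0, 1, 1]])
thresholds = np.linspace(0.0, 1.0, 11)
curve = iou_threshold_curve(loc_map, gt_mask, thresholds)

for t, iou in zip(thresholds, curve):
    print(f"threshold={t:.1f}  IoU={iou:.3f}")
```

Plotting `curve` against `thresholds` shows how sensitive a map is to the choice of binarization threshold, which a single box-level IoU score cannot reveal.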




Self-produced Guidance for Weakly-supervised Object Localization

Weakly supervised methods usually generate localization results based on...

Inter-Image Communication for Weakly Supervised Localization

Weakly supervised localization aims at finding target object regions usi...

ViTOL: Vision Transformer for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) aims at predicting object l...

Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization

Classification activation map (CAM), utilizing the classification struct...

Weakly Supervised Lesion Localization With Probabilistic-CAM Pooling

Localizing thoracic diseases on chest X-ray plays a critical role in cli...

Hierarchical Complementary Learning for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) is a challenging problem wh...

Deep Learning for Morphological Identification of Extended Radio Galaxies using Weak Labels

The present work discusses the use of a weakly-supervised deep learning ...
