Zero-Shot Co-salient Object Detection Framework

by Haoke Xiao, et al.
Xiamen University
Vivo Communication Technology Co. Ltd.

Co-salient Object Detection (CoSOD) endeavors to replicate the human visual system's capacity to recognize common and salient objects within a collection of images. Despite recent advances, deep learning models for CoSOD still rely on training with well-annotated CoSOD datasets, and training-free zero-shot CoSOD frameworks have received little attention. In this paper, taking inspiration from the zero-shot transfer capabilities of foundational computer vision models, we introduce the first zero-shot CoSOD framework that harnesses these models without any training process. To achieve this, we introduce two novel components in our proposed framework: the group prompt generation (GPG) module and the co-saliency map generation (CMP) module. We evaluate the framework's performance on widely used datasets and observe impressive results. Our approach surpasses existing unsupervised methods and even outperforms fully supervised methods developed before 2020, while remaining competitive with some fully supervised methods developed before 2022.




Zero-Shot Object Detection by Hybrid Region Embedding

Object detection is considered as one of the most challenging problems i...

More Context, Less Distraction: Visual Classification by Inferring and Conditioning on Contextual Attributes

CLIP, as a foundational vision language model, is widely used in zero-sh...

CamDiff: Camouflage Image Augmentation via Diffusion Model

The burgeoning field of camouflaged object detection (COD) seeks to iden...

Representation Learning for Resource-Constrained Keyphrase Generation

State-of-the-art keyphrase generation methods generally depend on large ...

Zero-shot Synthesis with Group-Supervised Learning

Visual cognition of primates is superior to that of artificial neural ne...

Zero-shot Faithful Factual Error Correction

Faithfully correcting factual errors is critical for maintaining the int...

Multinational Address Parsing: A Zero-Shot Evaluation

Address parsing consists of identifying the segments that make up an add...
