Open-vocabulary Panoptic Segmentation with Embedding Modulation

03/20/2023
by   Xi Chen, et al.

Open-vocabulary image segmentation is attracting increasing attention due to its critical real-world applications. Traditional closed-vocabulary segmentation methods cannot characterize novel objects, whereas several recent open-vocabulary attempts obtain unsatisfactory results, namely a notable performance drop on the closed vocabulary and a heavy demand for extra data. To this end, we propose OPSNet, an omnipotent and data-efficient framework for Open-vocabulary Panoptic Segmentation. Specifically, the carefully designed Embedding Modulation module, together with several supporting components, enables adequate embedding enhancement and information exchange between the segmentation model and the CLIP encoder, whose visual and linguistic representations are well aligned. This results in superior segmentation performance under both open- and closed-vocabulary settings with much less need for additional data. Extensive experimental evaluations are conducted across multiple datasets (e.g., COCO, ADE20K, Cityscapes, and PascalContext) under various circumstances, where the proposed OPSNet achieves state-of-the-art results, demonstrating the effectiveness and generality of the proposed approach. The code and trained models will be made publicly available.
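
For readers unfamiliar with the open-vocabulary setting, the following is a minimal sketch of the generic idea of labeling a mask by comparing its embedding against text embeddings of arbitrary category names in a shared, CLIP-style space. It is not OPSNet's Embedding Modulation module, whose details the abstract does not give. The function names, the toy three-dimensional embeddings, and the temperature value are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_mask(mask_embedding, text_embeddings, temperature=0.01):
    """Score one mask embedding against every category's text embedding.

    `text_embeddings` maps a category name to its embedding, so the label
    set can be changed at inference time without retraining (hypothetical
    helper; the temperature is an assumed value).
    """
    names = list(text_embeddings)
    logits = [cosine(mask_embedding, text_embeddings[n]) / temperature
              for n in names]
    return dict(zip(names, softmax(logits)))


# Toy stand-ins for embeddings that a real text/image encoder would produce.
text_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "zebra": [0.0, 0.0, 1.0],
}
mask_embedding = [0.1, 0.2, 0.9]

probs = classify_mask(mask_embedding, text_embeddings)
best = max(probs, key=probs.get)
print(best)
```

Because the category set is just a dictionary of text embeddings, adding a novel class in this sketch only requires adding another entry, which is what distinguishes the open-vocabulary setting from a fixed classifier head.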


