Explicit Visual Prompting for Low-Level Structure Segmentations

03/20/2023
by Weihuang Liu, et al.

We consider the generic problem of detecting low-level structures in images, which includes segmenting manipulated parts, identifying out-of-focus pixels, separating shadow regions, and detecting concealed objects. Whereas each of these topics has typically been addressed with a domain-specific solution, we show that a unified approach performs well across all of them. We take inspiration from the widely used pre-training and prompt-tuning protocols in NLP and propose a new visual prompting model, named Explicit Visual Prompting (EVP). Unlike previous visual prompting, which is typically a dataset-level implicit embedding, our key insight is to make the tunable parameters focus on the explicit visual content of each individual image, i.e., the features from frozen patch embeddings and the input's high-frequency components. The proposed EVP significantly outperforms other parameter-efficient tuning protocols under the same amount of tunable parameters (5.7% extra trainable parameters of each task). EVP also achieves state-of-the-art performance on diverse low-level structure segmentation tasks compared to task-specific solutions. Our code is available at: https://github.com/NiFangBaAGe/Explicit-Visual-Prompt.
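To make "the input's high-frequency components" concrete, the following is a minimal sketch of one common way to isolate them: transform the image with a 2-D FFT, zero a centered low-frequency block, and invert. It is an illustration under stated assumptions, not the authors' implementation. The function name `high_frequency_components` and the parameter `mask_ratio` are hypothetical, and the sketch handles a single grayscale image with NumPy only.

```python
import numpy as np


def high_frequency_components(image: np.ndarray, mask_ratio: float = 0.25) -> np.ndarray:
    """Return the part of a 2-D image that remains after low frequencies are removed.

    Hypothetical helper: `mask_ratio` is the approximate fraction of the
    spectrum's area (centered on the zero frequency) that is set to zero.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    if not 0.0 < mask_ratio < 1.0:
        raise ValueError("mask_ratio must lie strictly between 0 and 1")

    h, w = image.shape
    # Shift the spectrum so the zero frequency sits at index (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Half-widths of the centered block of low frequencies to suppress.
    rh = int(h * np.sqrt(mask_ratio) / 2)
    rw = int(w * np.sqrt(mask_ratio) / 2)
    cy, cx = h // 2, w // 2
    spectrum[cy - rh:cy + rh + 1, cx - rw:cx + rw + 1] = 0

    # Undo the shift and invert; the imaginary part is only floating-point noise
    # because the zeroed block is symmetric about the zero frequency.
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))


# Usage: a constant offset (pure low frequency) is removed,
# while a checkerboard pattern (highest frequency) passes through.
rows, cols = np.indices((8, 8))
checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)
filtered = high_frequency_components(5.0 + checker)
```

In a prompting setup of the kind the abstract describes, a map like `filtered` would be one of the per-image inputs to the small tunable modules, alongside features from the frozen patch embedding; how those are combined is not specified here.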


