LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

05/08/2023
by Peng Xia, et al.

Long-tailed multi-label visual recognition (LTML) is highly challenging because of label co-occurrence and imbalanced data distribution. In this work, we propose a unified framework for LTML, namely prompt tuning with class-specific embedding loss (LMPT), which captures the semantic feature interactions between categories by combining text and image modality data and improves performance on head and tail classes simultaneously. Specifically, LMPT introduces an embedding loss function with class-aware soft margin and re-weighting to learn class-specific contexts with the benefit of textual descriptions (captions), which helps establish semantic relationships between classes, especially between the head and tail classes. Furthermore, to account for class imbalance, the distribution-balanced loss is adopted as the classification loss function to further improve performance on the tail classes without compromising the head classes. Extensive experiments on the VOC-LT and COCO-LT datasets demonstrate that the proposed method significantly surpasses previous state-of-the-art methods and zero-shot CLIP in LTML. Our code is available at <https://github.com/richard-peng-xia/LMPT>.
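
To make the idea of an embedding loss with a class-aware soft margin and re-weighting more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the formulation used by the authors: the margin schedule `base / n ** 0.25`, the inverse-frequency weights, the hinge form, and all function names are hypothetical choices made for this example. The abstract does not give the exact loss, so consult the linked repository for the actual definition.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_margins(class_counts, base=1.0):
    # ASSUMPTION: rarer classes get a larger margin, base / n^(1/4).
    return [base / (n ** 0.25) for n in class_counts]


def class_weights(class_counts):
    # ASSUMPTION: inverse-frequency weights, rescaled so the mean weight is 1.
    inverse = [1.0 / n for n in class_counts]
    total = sum(inverse)
    return [len(class_counts) * x / total for x in inverse]


def embedding_loss(text_emb, class_embs, labels, class_counts, base=1.0):
    """Re-weighted hinge loss on caption-to-class cosine similarities.

    text_emb:     caption embedding (list of floats)
    class_embs:   one embedding per class (e.g. from a learned prompt)
    labels:       1 if the class is present in the image, else 0
    class_counts: number of training samples per class
    """
    margins = class_margins(class_counts, base)
    weights = class_weights(class_counts)
    loss = 0.0
    for c, class_emb in enumerate(class_embs):
        sim = cosine(text_emb, class_emb)
        sign = 1.0 if labels[c] else -1.0
        # Positives are pushed above +margin, negatives below -margin.
        loss += weights[c] * max(0.0, margins[c] - sign * sim)
    return loss / len(class_embs)
```

With hypothetical counts such as `[1000, 10]`, the tail class receives both a larger margin and a larger weight than the head class, so violations on rare classes contribute more to the loss.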


