CAIBC: Capturing All-round Information Beyond Color for Text-based Person Retrieval

by Zijie Wang, et al.

Given a natural language description, text-based person retrieval aims to identify images of a target person from a large-scale person image database. Existing methods generally suffer from a color over-reliance problem: the models rely heavily on color information when matching cross-modal data. Color information is indeed an important basis for retrieval decisions, but over-reliance on it distracts the model from other key clues (e.g., texture and structural information) and thereby leads to sub-optimal retrieval performance. To solve this problem, we propose to Capture All-round Information Beyond Color (CAIBC) via a jointly optimized multi-branch architecture for text-based person retrieval. CAIBC contains three branches: an RGB branch, a grayscale (GRS) branch and a color (CLR) branch. In addition, to make full use of all-round information in a balanced and effective way, a mutual learning mechanism enables the three branches, which attend to different aspects of information, to communicate with and learn from each other. Extensive experiments evaluate the proposed CAIBC method on the CUHK-PEDES and RSTPReid datasets in both supervised and weakly supervised text-based person retrieval settings, and demonstrate that CAIBC significantly outperforms existing methods and achieves state-of-the-art performance on all three tasks.
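
The abstract does not spell out how the branches are fed or how they learn from each other, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a grayscale input is obtained from RGB pixels with the standard ITU-R BT.601 luma weights, and that mutual learning is realized as an average KL divergence between the branches' softmax-normalized text-to-image similarity scores (a common formulation of mutual learning). The function names and the example scores are hypothetical.

```python
import math

def to_grayscale(pixel):
    """Map one (R, G, B) pixel to a luma value (ITU-R BT.601 weights).
    Assumption: the abstract does not say how the GRS input is built."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def softmax(scores):
    """Turn raw similarity scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_learning_loss(branch_scores):
    """Average KL(p_a || p_b) over all ordered pairs of distinct branches.
    Assumption: this stands in for CAIBC's unspecified mutual learning term."""
    probs = {name: softmax(s) for name, s in branch_scores.items()}
    names = list(probs)
    pairs = [(a, b) for a in names for b in names if a != b]
    return sum(kl_divergence(probs[a], probs[b]) for a, b in pairs) / len(pairs)

# Hypothetical similarity scores of one text query against three gallery images,
# as produced by each of the three branches.
scores = {
    "rgb": [2.0, 0.5, 0.1],
    "grs": [1.2, 1.0, 0.3],
    "clr": [2.5, 0.2, 0.4],
}
loss = mutual_learning_loss(scores)
```

In this sketch the loss is zero when all branches produce the same ranking distribution and grows as they disagree, so minimizing it alongside each branch's own matching loss would pull the branches toward a shared, more balanced decision.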




