Person Text-Image Matching via Text-Feature Interpretability Embedding and External Attack Node Implantation

11/16/2022
by Fan Li, et al.

Person text-image matching, also known as text-based person search, aims to retrieve images of specific pedestrians using text descriptions. Although research on person text-image matching has made great progress, existing methods still face two challenges. First, the lack of interpretability of text features makes it difficult to align them effectively with their corresponding image features. Second, the same pedestrian image often corresponds to multiple different text descriptions, and a single text description can correspond to multiple different images of the same identity. This diversity of text descriptions and images makes it difficult for a network to extract robust features that match the two modalities. To address these problems, we propose a person text-image matching method that embeds text-feature interpretability and implants an external attack node. Specifically, we improve the interpretability of text features by providing them with semantic information consistent with the image features, so that text features align with the image region features they describe. To address the challenges posed by the diversity of text and the corresponding person images, we treat the feature variation caused by this diversity as the result of perturbation information and propose a novel adversarial attack and defense method to handle it. In the model design, graph convolution is used as the basic framework for feature representation, and the adversarial attacks that text and image diversity impose on feature extraction are simulated by implanting an additional attack node in the graph convolution layer, which improves the robustness of the model against text and image diversity. Extensive experiments demonstrate the effectiveness of the proposed method and its superiority over existing text-pedestrian image matching methods. The source code of the method is published at
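The idea of implanting an additional attack node in a graph convolution layer can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the attack node is connected or trained, so the full connectivity, the random attack feature, the row-normalized propagation rule, and all function names below (`normalize_adjacency`, `graph_conv`, `implant_attack_node`) are assumptions chosen for illustration only.

```python
import random


def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each row sums to 1."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in with_loops]


def graph_conv(features, adj, weight):
    """One graph convolution layer: ReLU(A_norm @ X @ W)."""
    a = normalize_adjacency(adj)
    n, d_in, d_out = len(features), len(weight), len(weight[0])
    # Aggregate each node's neighborhood (including itself).
    agg = [
        [sum(a[i][k] * features[k][c] for k in range(n)) for c in range(d_in)]
        for i in range(n)
    ]
    # Linear projection followed by ReLU.
    return [
        [max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
         for o in range(d_out)]
        for i in range(n)
    ]


def implant_attack_node(features, adj, attack_feature):
    """Append one extra node and connect it to every existing node.

    Assumption: full connectivity; the source does not specify the wiring.
    """
    n = len(adj)
    new_adj = [row + [1.0] for row in adj] + [[1.0] * n + [0.0]]
    new_features = features + [attack_feature]
    return new_features, new_adj


# Toy graph: 3 nodes (standing in for word or image-region features), 2-dim each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity projection for readability

# Assumption: a random perturbation stands in for a learned attack feature.
rng = random.Random(0)
attack_feature = [rng.uniform(-1.0, 1.0) for _ in range(2)]

clean = graph_conv(features, adj, weight)
att_features, att_adj = implant_attack_node(features, adj, attack_feature)
# Keep only the outputs of the original nodes; the attack node is discarded.
attacked = graph_conv(att_features, att_adj, weight)[: len(features)]
```

Comparing `clean` with `attacked` shows how the implanted node perturbs the features of the original nodes during message passing. A defense stage would then train the network so that matching remains stable under such perturbations; that stage is not shown here.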


research
08/30/2022

Image-Specific Information Suppression and Implicit Local Alignment for Text-based Person Search

Text-based person search is a challenging task that aims to search pedes...
research
08/23/2023

Progressive Feature Mining and External Knowledge-Assisted Text-Pedestrian Image Retrieval

Text-Pedestrian Image Retrieval aims to use the text describing pedestri...
research
12/13/2021

Learning Semantic-Aligned Feature Representation for Text-based Person Search

Text-based person search aims to retrieve images of a certain pedestrian...
research
01/08/2021

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search

Text-based person search aims at retrieving target person in an image ga...
research
07/08/2023

Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

In visible-infrared video person re-identification (re-ID), extracting f...
research
09/11/2020

Devil's in the Detail: Graph-based Key-point Alignment and Embedding for Person Re-ID

Although Person Re-Identification has made impressive progress, difficul...
research
04/10/2023

Identity-Guided Collaborative Learning for Cloth-Changing Person Reidentification

Cloth-changing person reidentification (ReID) is a newly emerging resear...
