Hierarchical Matching and Reasoning for Multi-Query Image Retrieval

06/26/2023
by   Zhong Ji, et al.
nwpu.edu.cn
Tianjin University
Baidu, Inc.

Multi-Query Image Retrieval (MQIR) aims to retrieve the semantically relevant image given multiple region-specific text queries. Existing works mainly focus on a single-level similarity between image regions and text queries, which neglects the hierarchical guidance of multi-level similarities and results in incomplete alignments. In addition, the high-level semantic correlations that intrinsically connect different region-query pairs are rarely considered. To address these limitations, we propose a novel Hierarchical Matching and Reasoning Network (HMRN) for MQIR. It disentangles MQIR into three hierarchical semantic representations, which are responsible for capturing fine-grained local details, contextual global scopes, and high-level inherent correlations. HMRN comprises two modules: a Scalar-based Matching (SM) module and a Vector-based Reasoning (VR) module. Specifically, the SM module characterizes the multi-level alignment similarity, which consists of a fine-grained local-level similarity and a context-aware global-level similarity. The VR module then mines the potential semantic correlations among multiple region-query pairs to obtain a high-level reasoning similarity. Finally, the three levels of similarity are aggregated in a joint similarity space to form the final similarity. Extensive experiments on the benchmark dataset demonstrate that HMRN substantially surpasses the current state-of-the-art methods. For instance, compared with the existing best method, Drill-down, R@1 in the last round is improved by 23.4%. Code: https://github.com/LZH-053/HMRN.
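
The three-level aggregation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the mean-pooling used for the global level, the pairwise-agreement stand-in for the reasoning level, and the weighted sum are all assumptions made here for illustration, and the actual SM and VR modules are learned networks described in the full text.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_pool(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def local_similarity(regions, queries):
    """Fine-grained score: average cosine over matched region-query pairs."""
    return sum(cosine(r, q) for r, q in zip(regions, queries)) / len(regions)

def global_similarity(regions, queries):
    """Context-aware score: cosine between mean-pooled regions and queries."""
    return cosine(mean_pool(regions), mean_pool(queries))

def reasoning_similarity(regions, queries):
    """Hypothetical stand-in for a reasoning score (NOT the paper's VR module).

    Each pair gets a similarity vector (element-wise product); the score is
    the average cosine between the similarity vectors of different pairs.
    """
    vecs = [[a * b for a, b in zip(r, q)] for r, q in zip(regions, queries)]
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    if not pairs:
        return 0.0
    return sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)

def joint_similarity(regions, queries, weights=(1.0, 1.0, 1.0)):
    """Aggregate the three levels with an assumed weighted sum."""
    scores = (
        local_similarity(regions, queries),
        global_similarity(regions, queries),
        reasoning_similarity(regions, queries),
    )
    return sum(w * s for w, s in zip(weights, scores))
```

Here `regions` and `queries` are equal-length lists of feature vectors, with the i-th query describing the i-th region. Candidate images would be ranked by `joint_similarity`, and metrics such as R@1 computed from that ranking.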

12/16/2022

HGAN: Hierarchical Graph Alignment Network for Image-Text Retrieval

Image-text retrieval (ITR) is a challenging task in the field of multimo...
06/11/2021

Step-Wise Hierarchical Alignment Network for Image-Text Matching

Image-text matching plays a central role in bridging the semantic gap be...
10/21/2022

Reusing Keywords for Fine-grained Representations and Matchings

Question retrieval aims to find the semantically equivalent questions fo...
01/05/2021

Similarity Reasoning and Filtration for Image-Text Matching

Image-text matching plays a critical role in bridging the vision and lan...
08/30/2019

Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid

Matching clothing images from customers and online shopping stores has r...
03/08/2020

Adaptive Semantic-Visual Tree for Hierarchical Embeddings

Merchandise categories inherently form a semantic hierarchy with differe...
01/17/2023

USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval

As a fundamental and challenging task in bridging language and vision do...
