Where to Look and How to Describe: Fashion Image Retrieval with an Attentional Heterogeneous Bilinear Network

10/26/2020
by Haibo Su, et al.

Fashion products typically combine a variety of styles across different clothing parts. To distinguish images of different fashion products, we need to extract both appearance (i.e., "how to describe") and localization (i.e., "where to look") information, as well as their interactions. To this end, we propose a biologically inspired framework for image-based fashion product retrieval, which mimics the hypothesized two-stream visual processing system of the human brain. The proposed attentional heterogeneous bilinear network (AHBN) consists of two branches: a deep CNN branch to extract fine-grained appearance attributes and a fully convolutional branch to extract landmark localization information. A joint channel-wise attention mechanism is then applied to the extracted heterogeneous features to focus on important channels, followed by a compact bilinear pooling layer to model the interaction of the two streams. Our proposed framework achieves satisfactory performance on three image-based fashion product retrieval benchmarks.
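
The fusion step described in the abstract (channel-wise attention over both streams, then compact bilinear pooling) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the two streams are taken to be already pooled into per-channel vectors, the attention is a single sigmoid gate computed from the concatenated vectors (squeeze-and-excitation style), and compact bilinear pooling uses the common Count Sketch plus circular convolution construction. The function names, dimensions, and random weights are hypothetical, and the convolution is computed directly rather than via FFT to keep the example dependency-free.

```python
import math
import random


def joint_channel_attention(appearance, landmark, weights):
    """Gate every channel of both streams with a sigmoid computed from
    the concatenated (joint) descriptor. `weights` is a hypothetical
    square matrix; in a real model it would be learned."""
    joint = appearance + landmark  # list concatenation
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, joint))))
        for row in weights
    ]
    gated = [g * x for g, x in zip(gates, joint)]
    return gated[:len(appearance)], gated[len(appearance):]


def make_sketch_params(n_channels, out_dim, rng):
    """Draw a fixed random bucket and sign for each input channel."""
    buckets = [rng.randrange(out_dim) for _ in range(n_channels)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n_channels)]
    return buckets, signs


def count_sketch(x, buckets, signs, out_dim):
    """Project x to out_dim by adding each signed entry to its bucket."""
    out = [0.0] * out_dim
    for value, bucket, sign in zip(x, buckets, signs):
        out[bucket] += sign * value
    return out


def circular_convolution(a, b):
    """Direct O(d^2) circular convolution (an FFT is the usual choice)."""
    d = len(a)
    return [sum(a[j] * b[(k - j) % d] for j in range(d)) for k in range(d)]


def compact_bilinear(x, y, params_x, params_y, out_dim):
    """Approximate the outer product of x and y in out_dim dimensions."""
    sketch_x = count_sketch(x, *params_x, out_dim)
    sketch_y = count_sketch(y, *params_y, out_dim)
    return circular_convolution(sketch_x, sketch_y)


# Toy usage with made-up sizes: 6 appearance channels, 4 landmark channels.
rng = random.Random(0)
appearance = [rng.gauss(0, 1) for _ in range(6)]
landmark = [rng.gauss(0, 1) for _ in range(4)]
weights = [[rng.gauss(0, 0.1) for _ in range(10)] for _ in range(10)]

att_appearance, att_landmark = joint_channel_attention(
    appearance, landmark, weights
)
out_dim = 8
params_a = make_sketch_params(len(att_appearance), out_dim, rng)
params_l = make_sketch_params(len(att_landmark), out_dim, rng)
fused = compact_bilinear(
    att_appearance, att_landmark, params_a, params_l, out_dim
)
```

The point of the sketching step is that `fused` has a fixed, small length (`out_dim`) while still encoding pairwise products between appearance and landmark channels; a full outer product would grow with the product of the two channel counts.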

research
09/18/2017

Where to Focus: Deep Attention-based Spatially Recurrent Bilinear Networks for Fine-Grained Visual Recognition

Fine-grained visual recognition typically depends on modeling subtle dif...
research
12/27/2022

Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval

This paper proposes an attribute-guided multi-level attention network (A...
research
04/06/2021

Fine-Grained Fashion Similarity Prediction by Attribute-Specific Embedding Learning

This paper strives to predict fine-grained fashion similarity. In this s...
research
05/17/2023

From Region to Patch: Attribute-Aware Foreground-Background Contrastive Learning for Fine-Grained Fashion Retrieval

Attribute-specific fashion retrieval (ASFR) is a challenging information...
research
11/09/2019

Learning Deep Bilinear Transformation for Fine-grained Image Representation

Bilinear feature transformation has shown the state-of-the-art performan...
research
02/20/2018

MoNet: Moments Embedding Network

Bilinear pooling has been recently proposed as a feature encoding layer,...