A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch

08/05/2022
by Patsorn Sangkloy, et al.

We address the problem of retrieving images with both a sketch and a text query. We present TASK-former (Text And SKetch transformer), an end-to-end trainable model for image retrieval using a text description and a sketch as input. We argue that the two input modalities complement each other in a manner that cannot be achieved easily by either one alone. TASK-former follows the late-fusion dual-encoder approach, similar to CLIP, which allows efficient and scalable retrieval since the retrieval set can be indexed independently of the queries. We empirically demonstrate that using an input sketch (even a poorly drawn one) in addition to text considerably increases retrieval recall compared to traditional text-based image retrieval. To evaluate our approach, we collect 5,000 hand-drawn sketches for images in the test set of the COCO dataset. The collected sketches are available at https://janesjanes.github.io/tsbir/.
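To make the dual-encoder idea in the abstract concrete, here is a minimal sketch of retrieval over a pre-indexed image set, together with a recall@k measure. It is not the TASK-former implementation: the embeddings are random stand-ins for encoder outputs, and the function names (`build_index`, `fuse_query`, `retrieve`, `recall_at_k`) as well as the fusion rule (summing the text and sketch embeddings) are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def build_index(image_embeddings):
    """Normalize image embeddings once, offline, independently of any query."""
    return l2_normalize(np.asarray(image_embeddings, dtype=np.float64))


def fuse_query(text_embedding, sketch_embedding):
    """Combine the two query modalities into one vector.

    ASSUMPTION: a plain sum followed by normalization. The abstract does not
    say how TASK-former fuses text and sketch.
    """
    fused = np.asarray(text_embedding, dtype=np.float64) + np.asarray(
        sketch_embedding, dtype=np.float64
    )
    return l2_normalize(fused)


def retrieve(index, query, k=5):
    """Return the ids and scores of the k most similar indexed images."""
    scores = index @ query              # cosine similarity to every image
    top = np.argsort(-scores)[:k]       # highest scores first
    return top, scores[top]


def recall_at_k(index, queries, targets, k):
    """Fraction of queries whose target image id appears in the top k."""
    hits = 0
    for query, target in zip(queries, targets):
        top, _ = retrieve(index, query, k)
        hits += int(target in top)
    return hits / len(targets)


# Toy usage with random stand-ins for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(100, 16))
index = build_index(image_embeddings)

target_id = 42
text_embedding = image_embeddings[target_id] + rng.normal(scale=0.5, size=16)
sketch_embedding = image_embeddings[target_id] + rng.normal(scale=0.5, size=16)

query = fuse_query(text_embedding, sketch_embedding)
top_ids, top_scores = retrieve(index, query, k=5)
print("top-5 image ids:", top_ids)
```

Because the index is built without reference to any query, it can be computed once and reused; only the query embedding has to be computed at search time, which is the scalability property the abstract attributes to the late-fusion dual-encoder design.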

research
04/28/2018

Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch

In this work we introduce a cross modal image retrieval system that allo...
research
02/24/2020

Sketchformer: Transformer-based Representation for Sketched Structure

Sketchformer is a novel transformer-based representation for encoding fr...
research
09/14/2022

Transformers and CNNs both Beat Humans on SBIR

Sketch-based image retrieval (SBIR) is the task of retrieving natural im...
research
12/06/2018

Deep Embedding using Bayesian Risk Minimization with Application to Sketch Recognition

In this paper, we address the problem of hand-drawn sketch recognition. ...
research
03/04/2022

FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

We advance sketch research to scenes with the first dataset of freehand ...
research
09/27/2019

Query by Semantic Sketch

Sketch-based query formulation is very common in image and video retriev...
research
07/27/2022

Abstracting Sketches through Simple Primitives

Humans show high-level of abstraction capabilities in games that require...
