From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts

11/30/2018
by   Moshiur R Farazi, et al.

Current Visual Question Answering (VQA) systems can answer questions about 'Known' visual content, but their performance drops significantly when questions about visually and linguistically 'Unknown' concepts are presented during inference (the 'Open-world' scenario). A practical VQA system should be able to handle novel concepts in real-world settings. To address this problem, we propose an exemplar-based approach that transfers learning (i.e., knowledge) from previously 'Known' concepts to answer questions about the 'Unknown'. We learn a highly discriminative joint embedding space, where visual and semantic features are fused into a unified representation. When novel concepts are presented to the model, it looks for the closest match from an exemplar set in the joint embedding space. This auxiliary information is used alongside the given Image-Question pair to refine visual attention in a hierarchical fashion. Since handling high-dimensional exemplars on large datasets can be a significant challenge, we introduce an efficient matching scheme that uses a compact feature description for search and retrieval. To evaluate our model, we propose a new split for VQA that separates Unknown visual and semantic concepts from the training set. Our approach shows significant improvements over state-of-the-art VQA models on both the proposed Open-World VQA dataset and standard VQA datasets.
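The exemplar-retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual model: the fusion here is simple normalize-and-concatenate rather than a learned joint embedding, and the compact descriptor is a random projection standing in for the paper's efficient matching scheme. All dimensions, function names, and the random features are hypothetical.

```python
import numpy as np

def fuse(visual, semantic):
    # Hypothetical fusion: L2-normalize each modality and concatenate
    # into one joint-embedding vector (the paper learns this fusion).
    v = visual / (np.linalg.norm(visual) + 1e-8)
    s = semantic / (np.linalg.norm(semantic) + 1e-8)
    return np.concatenate([v, s])

def compress(embedding, proj):
    # Compact feature description via random projection, a stand-in
    # for the paper's efficient search-and-retrieval scheme.
    return embedding @ proj

def nearest_exemplar(query, exemplars):
    # Cosine-similarity search: return the index of the closest
    # exemplar in the compact joint embedding space.
    q = query / (np.linalg.norm(query) + 1e-8)
    e = exemplars / (np.linalg.norm(exemplars, axis=1, keepdims=True) + 1e-8)
    return int(np.argmax(e @ q))

rng = np.random.default_rng(0)
dim_v, dim_s, dim_c, n = 512, 300, 64, 1000  # illustrative sizes
proj = rng.standard_normal((dim_v + dim_s, dim_c)) / np.sqrt(dim_c)

# Build a compact exemplar bank from (visual, semantic) feature pairs.
exemplar_bank = np.stack([
    compress(fuse(rng.standard_normal(dim_v), rng.standard_normal(dim_s)), proj)
    for _ in range(n)
])

# At inference, embed the novel Image-Question pair the same way and
# retrieve the closest 'Known' exemplar as auxiliary information.
query = compress(fuse(rng.standard_normal(dim_v), rng.standard_normal(dim_s)), proj)
idx = nearest_exemplar(query, exemplar_bank)
```

The retrieved exemplar would then be fed alongside the Image-Question pair to refine visual attention; that hierarchical attention stage is not sketched here.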


