Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

10/03/2018
by   Hyeonwoo Noh, et al.
0

We study how to leverage off-the-shelf visual and linguistic data to cope with out-of-vocabulary answers in visual question answering. Existing large-scale visual data with annotations such as image class labels, bounding boxes and region descriptions are good sources for learning rich and diverse visual concepts. However, it is not straightforward how the visual concepts should be captured and transferred to visual question answering models due to missing link between question dependent answering models and visual data without question or task specification. We tackle this problem in two steps: 1) learning a task conditional visual classifier based on unsupervised task discovery and 2) transferring and adapting the task conditional visual classifier to visual question answering models. Specifically, we employ linguistic knowledge sources such as structured lexical database (e.g. Wordnet) and visual descriptions for unsupervised task discovery, and adapt a learned task conditional visual classifier to answering unit in a visual question answering model. We empirically show that the proposed algorithm generalizes to unseen answers successfully using the knowledge transferred from the visual data.

READ FULL TEXT

page 2

page 7

page 8

page 12

page 14

research
09/23/2018

Textually Enriched Neural Module Networks for Visual Question Answering

Problems at the intersection of language and vision, like visual questio...
research
09/04/2019

Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering

Object detection plays an important role in current solutions to vision ...
research
12/27/2021

Multi-Image Visual Question Answering

While a lot of work has been done on developing models to tackle the pro...
research
06/10/2018

Learning Answer Embeddings for Visual Question Answering

We propose a novel probabilistic model for visual question answering (Vi...
research
05/04/2020

Visual Question Answering with Prior Class Semantics

We present a novel mechanism to embed prior knowledge in a model for vis...
research
07/17/2020

Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions

To understand movies, humans constantly reason over the dialogues and ac...
research
10/21/2020

Unsupervised Deep Learning based Multiple Choices Question Answering: Start Learning from Basic Knowledge

In this paper, we study the possibility of almost unsupervised Multiple ...

Please sign up or login with your details

Forgot password? Click here to reset