Survey of Visual Question Answering: Datasets and Techniques

05/10/2017
by Akshay Kumar Gupta, et al.

Visual question answering (VQA) is a new and exciting problem that combines natural language processing and computer vision techniques. We present a survey of the datasets and models that have been used to tackle this task. The first part of the survey describes the datasets for VQA and compares them along some common factors. The second part describes the different approaches to VQA, classified into four types: non-deep learning models, deep learning models without attention, deep learning models with attention, and other models that do not fit into the first three categories. Finally, we compare the performance of these approaches and provide some directions for future work.


research
08/27/2019

Visual Question Answering using Deep Learning: A Survey and Performance Analysis

The Visual Question Answering (VQA) task combines challenges for process...
research
03/21/2018

Attention on Attention: Architectures for Visual Question Answering (VQA)

Visual Question Answering (VQA) is an increasingly popular topic in deep...
research
11/16/2021

Language bias in Visual Question Answering: A Survey and Taxonomy

Visual question answering (VQA) is a challenging task, which has attract...
research
08/17/2019

U-CAM: Visual Explanation using Uncertainty based Class Activation Maps

Understanding and explaining deep learning models is an imperative task....
research
12/19/2019

Deep Exemplar Networks for VQA and VQG

In this paper, we consider the problem of solving semantic tasks such as...
research
06/05/2017

Deep learning evaluation using deep linguistic processing

We discuss problems with the standard approaches to evaluation for tasks...
research
04/13/2021

Neuro-Symbolic VQA: A review from the perspective of AGI desiderata

An ultimate goal of the AI and ML fields is artificial general intellige...
