Open-Ended Visual Question-Answering

10/09/2016
by   Issey Masuda, et al.
0

This thesis report studies methods to solve Visual Question-Answering (VQA) tasks with a Deep Learning framework. As a preliminary step, we explore Long Short-Term Memory (LSTM) networks used in Natural Language Processing (NLP) to tackle Question-Answering (text based). We then modify the previous model to accept an image as an input in addition to the question. For this purpose, we explore the VGG-16 and K-CNN convolutional neural networks to extract visual features from the image. These are merged with the word embedding or with a sentence embedding of the question to predict the answer. This work was successfully submitted to the Visual Question Answering Challenge 2016, where it achieved a 53,62 has followed the best programming practices and Python code style, providing a consistent baseline in Keras for different configurations.

READ FULL TEXT

page 1

page 15

page 20

page 40

research
12/12/2016

VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering

In this paper, we address the problem of visual question answering by pr...
research
12/07/2015

Simple Baseline for Visual Question Answering

We describe a very simple bag-of-words baseline for visual question answ...
research
11/08/2019

Are we asking the right questions in MovieQA?

Joint vision and language tasks like visual question answering are fasci...
research
09/02/2016

Skipping Word: A Character-Sequential Representation based Framework for Question Answering

Recent works using artificial neural networks based on word distributed ...
research
11/21/2015

Adding Gradient Noise Improves Learning for Very Deep Networks

Deep feedforward and recurrent networks have achieved impressive results...
research
06/10/2022

Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model

Current architectures for multi-modality tasks such as visual question a...
research
09/25/2017

Adaptive Convolutional Filter Generation for Natural Language Understanding

Convolutional neural networks (CNNs) have recently emerged as a popular ...

Please sign up or login with your details

Forgot password? Click here to reset