Learning to Count Objects in Natural Images for Visual Question Answering

02/15/2018
by Yan Zhang, et al.

Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify a fundamental problem caused by soft attention in these models. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component improves counting over a strong baseline by 6.6%.
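
The abstract does not spell out the soft-attention problem, so the following is only an illustrative sketch of one way to see it, not the authors' code. It assumes softmax-normalized attention over object proposals and uses a made-up two-dimensional feature vector `CAT` and made-up attention scores. Because the weights sum to one, the attended feature for two identical proposals matches that for a single proposal, so the count is no longer recoverable from it.

```python
import math


def softmax(scores):
    """Normalize raw attention scores so the weights sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Return the attention-weighted average of the proposal features."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


# Hypothetical feature vector for a "cat" proposal (an assumption for illustration).
CAT = [1.0, 0.0]

# One cat versus two identical cats, each with the same (made-up) attention score.
one_cat = attend([CAT], [2.0])
two_cats = attend([CAT, CAT], [2.0, 2.0])

# Both attended features are the same, so a classifier reading this vector
# cannot tell whether the image held one cat or two.
same = all(math.isclose(a, b) for a, b in zip(one_cat, two_cats))
print(one_cat, two_cats, same)
```

In this toy setting the weighted average discards how many proposals contributed to it, which is consistent with the abstract's motivation for counting from the object proposals directly.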


research
05/21/2018

Reproducibility Report for "Learning To Count Objects In Natural Images For Visual Question Answering"

This is the reproducibility report for the paper "Learning To Count Obje...
research
12/23/2017

Interpretable Counting for Visual Question Answering

Questions that require counting a variety of objects in images remain a ...
research
10/29/2018

TallyQA: Answering Complex Counting Questions

Most counting questions in visual question answering (VQA) datasets are ...
research
04/24/2020

Revisiting Modulated Convolutions for Visual Counting and Beyond

This paper targets at visual counting, where the setup is to estimate th...
research
04/03/2022

Question-Driven Graph Fusion Network For Visual Question Answering

Existing Visual Question Answering (VQA) models have explored various vi...
research
03/31/2021

Analysis on Image Set Visual Question Answering

We tackle the challenge of Visual Question Answering in multi-image sett...
research
11/19/2019

Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA

In this paper, we aim to obtain improved attention for a visual question...
